Diagnosing growth in low-grade gliomas with and without artificial intelligence-measured longitudinal volume measurements: A retrospective observational study

Hassan M Fathallah-Shaykh; Houman Sotoudeh; Markus Bredel; Alex Whitley; Jinsuh Kim; Fanny E Morón; Fabio Raman; Nidhal Bouaynaya; Hayat Rahal

doi:10.1093/noajnl/vdaf271

. 2026 Jan 6;8(1):vdaf271. doi: 10.1093/noajnl/vdaf271

Diagnosing growth in low-grade gliomas with and without artificial intelligence-measured longitudinal volume measurements: A retrospective observational study

Hassan M Fathallah-Shaykh ^1,^✉, Houman Sotoudeh ^2,^✉, Markus Bredel ³, Alex Whitley ⁴, Jinsuh Kim ⁵, Fanny E Morón ⁶, Fabio Raman ⁷, Nidhal Bouaynaya ⁸, Hayat Rahal ⁹

PMCID: PMC12909261 PMID: 41710546

AbstractAbstract

Background

Low-grade or grade 2 diffuse gliomas (LGG) infiltrate the brains leading to significant neurological morbidity. This retrospective observational study evaluates the ability of AI-assisted volumetric analysis to correctly detect tumor growth in longitudinal studies of LGG as compared to the standard clinical method.

Methods

A total of 56 gliomas and 7 stable FLAIR lesions were included; gliomas were classified as clinical progression (n = 34), or clinically stable (n = 22). All gliomas were from radiation-naïve patients; only 2 patients had completed treatment with temozolomide. The dates of tumor growth were gathered from clinical notes. Longitudinal tumor volumes were calculated by the MRIMath FLAIR AI. Golden truths were obtained by physician reviews using the MRIMath Smart contouring system. Growth by significant shifts in tumor volumes was detected by using the statistical method of online change-of-point method.

Results

In the clinical progression group, automatic AI segmentation followed by human review detected tumor growth at a median of 21 months earlier than visual inspection. In the clinically stable group, AI with human review identified growth in 13/22 cases at a median of 23 months earlier than the last magnetic resonance imaging. AI without human review generated similar results but with a 25% false positive and an 8.33% false negative rate. The median time spent by physicians in reviewing, revising, and approving the AI segmentations is 2 minutes.

Conclusions

These findings highlight the clinical potential of AI-assisted volumetric analysis followed by physician oversight for the timely detection of tumor progression in LGG patients.

Keywords: artificial intelligence, longitudinal imaging, low-grade glioma, volumetric analysis

Key Points.

AI-supported, physician-reviewed volumetric analysis of low-grade gliomas enables earlier detection of tumor growth compared with standard clinical practice.
A National Cancer Institute-funded, 510(k)-cleared AI-supported online device integrates efficiently into routine neuro-oncology workflows.
AI-generated contours and volumetric measurements are produced within seconds, with physician review completed in 2-3 minutes.

Importance of the Study

Low-grade (WHO grade 2) diffuse gliomas (LGG) infiltrate the brains of young, otherwise productive adults and cause substantial neurological morbidity. Because volumetric tumor measurements have historically been time-consuming and labor-intensive, current clinical practice largely relies on qualitative visual assessment by doctors. To address this, the National Cancer Institute supported the development of an FDA-cleared AI-enabled online platform that rapidly measures tumor volume and provides physician-reviewed results within minutes. In our study, combining AI with physician oversight allowed tumor growth to be detected much earlier—on average nearly 2 years sooner—than with standard visual assessment alone. In several cases, the AI-assisted approach identified progression that had been previously considered stable. While the AI alone performs better than standard care, the most accurate results are achieved when physicians review the AI output. This technology could help doctors recognize tumor changes sooner and make more informed decisions about patient care.

Low-grade gliomas (LGGs) represent a significant challenge due to their subtle growth patterns and the difficulty of early and accurate detection of tumor progression. LGG account for approximately 15% of all gliomas, with an incidence rate of about 1 in 100,000. These tumors are most observed in adults in their 30s and 40s.¹ LGGs, classified as WHO grade 2 gliomas, constitute a considerable proportion of adult brain tumors and are known for their brain invasion and potential for malignant transformation; the median life expectancy of an IDHmt astrocytoma grade 2 is between 10-15 years, and for an oligodendroglioma slightly longer.¹ The predominant majority of LGG carry mutations in the IDH gene, which would make them potentially susceptible to clinically proven IDH inhibitors.² The incidence of LGGs transforming into higher-grade tumors has been reported to be as high as 72%.³ The prognosis of grade 2 IDH-mutant astrocytomas is only slightly better than grade 3, with median survival of 11 years compared to 9 years.³^,⁴

The current standard of care for treating LGG includes a multidisciplinary approach tailored to individual patient factors, such as tumor location, genetic markers, and patient age. Surgical resection is typically the initial treatment, with the goal of maximal safe removal of the tumor to alleviate symptoms and obtain tissue for histopathological and molecular analysis.⁵ Postoperative management protocols are guided by molecular markers such as IDH mutation and 1p/19q codeletion status, which have prognostic significance. For high-risk patients (eg, subtotal resection, or unfavorable molecular profile) adjuvant therapies are considered. Post-radiation chemotherapy is often recommended to improve progression-free and overall survival.⁶ Regardless of the treatment protocol and molecular profile, the clinical management of LGGs typically relies on regular monitoring by longitudinal magnetic resonance imaging (MRI) to detect progression. Traditionally, visual inspection by radiologists has been the clinical gold standard for assessing tumor growth over time. However, this method is inherently subjective and prone to inter-observer variability, often resulting in delayed detection of growth, which can impact patient outcomes.⁷

Traditionally, longitudinal studies of LGGs have been evaluated using visual inspection or 2D diameter measurements, which have been the standard technique for assessment according to the Response Assessment in Neuro-Oncology (RANO) criteria.⁸ So far, volumetric measurements have not been widely used because manual contouring methods are time-consuming and prone to high inter-user variability. Recent advances in artificial intelligence bring timely and accurate volumetric measurements within reach.⁷^,⁹ There has been a gradual, though slow, shift from 2D measurements to 3D/volumetric assessments of LGGs. Additionally, the most recent version of the RANO criteria, RANO 2.0 (introduced in 2023), now incorporates 2D, 3D, and volumetric measurements. It also allows for AI-based segmentation, with human supervision during the process.¹⁰

Here, we evaluate the efficacy of volumetric analysis using the MRIMath platform, including a FLAIR AI and human supervision using the Smart contouring system, in detecting tumor growth in LGG patients as compared to standard visual inspection. We study AI alone and AI combined with human review and approval. Previous studies have shown the benefits of volumetric analysis combined with the online change-of-point statistical method in detecting tumor growth significantly earlier than visual inspection.¹¹ Fathallah-Shaykh et al. applied the computationally intensive method of non-negative matrix factorization to segment LGG¹²; here we use an accurate and efficient AI that generates segmentations in seconds. We also evaluate the role of physician review in enhancing AI prediction accuracy. We hypothesize that AI with volumetric assessment allows for earlier detection of tumor growth than current clinical standard of radiological assessment. For volumetric assessment, we apply the statistical change-of-point method to determine the first point of growth. The change-of-point method applies the same rigorous statistical standard to all patients and studies and determines if a current measurement is significantly different from all the measurements of the same patient.⁷^,⁹

Methods

Ethical Approval

The Institutional Review Board of the University of Alabama at Birmingham approved the research (IRB-150618007); waiver of informed consent was granted because the research involved no greater than minimal risk and no procedures for which written consent is normally required outside the research context.

Study Design and Patient Selection

The patients were seen at the neuro-oncology clinics of the University of Alabama at Birmingham between July 1, 2017, and May 14, 2018. The inclusion criteria were (1) pathological diagnosis of grade 2 oligodendroglioma (oligo), grade 2 astrocytoma (astro), or grade 2 mixed glioma (oligostarocytoma) in the brain excluding the pineal gland and (2) either no postoperative treatments or chemotherapy with temozolomide, (3) at least 4 MRI scans available for review either after the initial diagnosis or after the completion of chemotherapy with temozolomide (if applicable). We include mixed gliomas because of the retrospective nature of this study. The exclusion criteria were (1) treatment with radiation therapy after the initial diagnosis or (2) radiological reports indicating development of new enhancement without an increase in FLAIR signal. Patients treated by radiation therapy were excluded because radiation may confound the results by causing an independent increase in FLAIR signal. We excluded patients whose radiological reports described new enhancing nodules without an increase in FLAIR signal because they are easily detected by visual examination.

A total of 56 gliomas met the inclusion criteria, including 22 oligodendrogliomas, 30 astrocytomas, and 4 mixed gliomas; only 2 patients received temozolomide. All of the oligos had the 1p/19q co-deletions except for 1 with a single deletion of 19q. The pathological reports for the 4 gliomas classified as mixed lacked molecular data necessary for reclassification. At the time of retrospective review, 34/56 patients had been diagnosed with clinical progression (ie, clinical progression group) while the remaining 22/56 were diagnosed as being clinically stable (clinically stable group) by visual comparison of the most recent MRI performed at the last clinic visit. We reviewed the records of 8 patients monitored for imaging abnormalities without a pathological diagnosis. One patient was excluded due to insufficient follow-up data. After appropriate follow-up with a mean of 6 years, clinicians determined that the imaging abnormalities in the remaining 7 patients were unlikely to represent gliomas. Their ages range from 45 to 68 with a mean of 59 years. To ensure methodological rigor, these 7 cases were included as negative controls in the application of the change-point method to measured volumes. As no surgical intervention was performed, a pathological diagnosis could not be established. The most likely diagnosis was demyelination; these lesions were not attributable to vascular malformations or postoperative changes. The same AI pipeline was applied to the 7 patients in the stable FLAIR (negative control) group.

The selection criteria required at least 4 MRI scans, excluding those who received radiation therapy post-diagnosis, to ensure consistency in the study population. The final dataset included 63 patients divided into 3 groups: clinical progression (n = 34), clinically stable (n = 22), and imaging abnormality (n = 7).

Time to Growth Detected by Standard Clinical Care

Both the radiologist and neuro-oncologist had access to the pathological findings through the medical records. Radiological reports were generated by various board-certified neuroradiologists at the University of Alabama at Birmingham Hospital after reviewing each longitudinal MRI scan. These reports did not specify whether the current scans were compared with postoperative MRIs and the radiologists did not include any 2D measurements in their reports. Neuro-oncology notes were also reviewed retrospectively. Clinical progression was defined by a radiological report indicating tumor growth in conjunction with a neuro-oncology note documenting communication of this change to the patient or family. The time to growth detection was retrospectively calculated based on the impressions recorded in the radiological reports. Because of the retrospective nature of this study, the radiologists and neuro-oncologists were blinded to the AI outputs.

The MRIMath Cyber Device

The MRIMath i²Contour cyber device has received 510k approval from the Food and Drug administration (www.mrimath.com).¹³^,¹⁴ The platform establishes a secure connection with a picture archiving and communication system or folder located behind institutional firewall for automatic transfer of a study in seconds. Users may: (1) login by 2-factor authentication or single sign-on from any location using a chrome browser, (2) upload data, (3) view dicom files, (4) order AI segmentation for GBM T1c or FLAIR sequences and obtain results in seconds, (5) review and make changes using the MRIMath low-variability Smart contouring system,⁹ (6) view volume plots of longitudinal studies, and (7) download both volumetric data and segmentations.

MRI scans were uploaded to the MRIMath platform and segmented by the GBM FLAIR AI, trained on both pre- and post-operative segmented FLAIR images; the mean overall Dice-Sørensen coefficient (DICE) score of the AI as compared to a common ground truth derived from 3 independent neuro-radiologists was 92%⁹; a description of the training of the MRIMath AI and how it compares to other segmentation systems is detailed at.⁹ We have used Version 1.0. The input of the AI is one slice at a time. While the MRIMath AI models are primarily trained on 2D slices, they include post-processing steps that incorporate 3D spatial continuity and contextual information. The parameters of the trained AI model are fixed.

The MRIMath platform includes a Smart Contouring system that enables physicians to review, revise, and approve the AI-generated segmentations. MRIMath© adopts a streamlined approach by handling a single series type (FLAIR or T1c), which contrasts with the modality integration and subcomponent segmentations used by the other platforms.¹⁵^,¹⁶

Tumor Segmentation, Physician Review, and Volume Calculations

A total of 653 MRIs were uploaded to the MRIMath platform (the dicom headers are summarized in Supplementary Information). Segmentation results were reviewed by 98 physicians, including neuroradiologists (n = 21), neuro-oncologists (n = 14), radiation oncologists (n = 60), and neurosurgeons (n = 3). The reviewers included 31 physicians in private practice, 16 residents, 5 fellows, 24 assistant professors, 13 associate professors, and 9 professors. Physician reviews served as the gold standard for calculating the average dice score for evaluating the accuracy of the AI-generated manual segmentations. Tumor volumes were calculated from the segmented tumors. Inter-user variability was measured from the cases with more than one reviewer by calculating the average Dice score by taking one reviewer as the gold standard. Intra-user variability was measured by computing the average Dice score between 13 MRIs approved by 2 physicians 18 months apart.

Online Change-of-Point Detection

To detect the first significant change in a series of longitudinal volumes, we apply the online change-of-point method to the AI-generated measurements with or without human review. To exclude FLAIR changes due to the evolution of post-surgical changes, the baseline volume in the longitudinal series was the first minimum after surgical resection. To identify an abrupt change of volume, we applied a change in the root-mean-square level at a minimum threshold of 500/(volume at baseline) and a minimum of 2 samples between change points. The number 500 corresponds to 5% of the rounded median of the baseline volume. The change-point MATLAB (Natick, MA) function is provided in the Supplementary Material.

The change-of-point is a statistical technique used to identify points in a time series where the data’s properties, such as mean, variance, or distribution, change significantly.¹⁷^,¹⁸ These points, known as change points, indicate shifts in the underlying process generating the data. Detecting such changes is crucial in patient follow up and can help the physician to detect subtle changes of MR images. There are 2 primary approaches to change point detection: the online method detects changes in points in real-time as new data becomes available, making it essential for applications requiring immediate detection and response. The offline method analyzes a complete dataset to identify any change points that have occurred. It’s often used in post hoc analyses where the entire data sequence is available.¹⁹ Here, we use the algorithms developed by Rebecca Killick and apply the online method, as it replicates clinical scenarios where the volumetric series includes only measurements taken before a specific date.²⁰

Results

Review Time, Dice Scores, and Variability

The median time spent reviewing, revising and approving the AI segmentations was 1.18 minutes (lower quartile of 1.18 minutes and an upper quartile of 3.31), mean = 2.93 minutes, 95% CI: (2.32, 3.54). The average DICE score of the FLAIR AI predictions, as compared to the golden truths segmentations approved by physicians, is 87.7. Fifty-one cases were reviewed by more than one physician; the average inter-user DICE score is 87.5, corresponding to a variability of 12.5. The average intra-user DICE score is 93.1, which corresponds to a variability of 6.9%.

Clinical Progression Group

The clinical progression group includes the LGG patients whose last MRIs were interpreted as showing progression by the radiological reports. The results demonstrate that volumetric analysis calculated by AI segmentation followed by human review detects tumor growth significantly earlier than traditional visual inspection by radiologists (median of 21 months, Table 1, and Figure 1). Conversely, AI without human review also detects tumor growth earlier than visual inspection (median of 21 months), but with false negative and positive results (Table 1); it missed 3 out of 35 cases, and these misses were primarily associated with smaller tumors that were more challenging to detect and segment.

Table 1.

Detecting tumor growth in the Clinical Progression Group earlier than visual inspection by radiologists

Measure	AI without human review	AI with human review
Median (months)	21	21
Lower interquartile range (IQR)	11.75	9
Upper IQR	38.75	31.5

Open in a new tab

Figure 1. — Volumetric analysis combined with the online change-of-point statistical method detects progression on 30 July 2012, 1 year earlier than the date of progression documented in the clinical notes (15 July 2013). Columns A-E correspond to sections within the same MRI series. Notice that though the tumor in the 2D sections containing the largest tumors under columns (C) and (D) did not change between the baseline (25 July 2011) and 30 July 2012, the 2D sections in columns B and E increased in size between the baseline MRI of 07/25/2011 and the date of progression detected by the change-of-point method, 30 July 3012 (white arrows). (F) The online change-of-point analysis detects progression at the red Asterix. The clinical notes detected progression 1 year later.

Clinically Stable Group

The clinically stable group includes patients whose last MRI was deemed stable by visual inspection. Volumetric analysis aided by the AI and reviewed by a human, detected tumor growth in 13 out of 22 cases (Table 2 and Figure 2). Remarkably, this detection occurred at a median of 23 months earlier than the most recent MRI scan. In comparison, AI without human review detected growth at a median of 26 months earlier but missed 1 out of the 13 cases. Moreover, while AI with human review accurately flagged the remaining 9 cases as stable, AI without review incorrectly detected tumor growth in 1 of those 9 cases.

Table 2.

Detecting growth in the Clinically-Stable Group earlier than the last MRI

Measure	AI without human review	AI with human review
Median (months)	26	23
Lower IQR	11.75	9
Upper IQR	67.5	67

Open in a new tab

Figure 2. — Volumetric analysis combined with the online change-of-point method detected progression on 19 December 2016 (see arrows) as compared to baseline on 13 June 2016. The columns in (A) represent sequential sections from the same MRI series. The MRI on 11 December 2017 was read as stable; arrows point to tumor growth as compared to the MRI of 19 December 2016. (B) The online change-of-point analysis detects progression at the red Asterix. The clinical notes considered all these MRIs as stable.

Negative Control Group

The negative control group consisted of FLAIR lesions monitored without a pathological cancer diagnosis. The performance of the volumetric analysis using the change-point method varied significantly depending on whether human review was involved. When used in conjunction with physician review, the system correctly identified all 7 cases as stable, demonstrating perfect specificity. In contrast, without human oversight, 3 out of 7 cases were incorrectly flagged as tumor growth.

False Negative and False Positive Rates

False negative rates

By combining the 35 cases from the clinical progression group and the 13 cases from the clinically stable group that showed progression after computer-assisted diagnosis, the sensitivity the AI without human review missed 4 of these 48 cases (8.33%, Table 3). The variability in the volume measurements caused the change-point method to identify statistically significant changes at different time points.

Table 3.

False positive and negative rates (per patient)

Method	FP rate	FN rate
AI without human review	25% (4/16)	8.33% (4/48)

Open in a new tab

False positive rates

By combining the 7 cases from the negative control group and 9 stable cases from the clinically stable group, we find that the AI without human review incorrectly flagged a total of 4/16 cases as tumor growth (Table 3).

Discussion

We find that human-supervised, AI-supported measurements of longitudinal LGG volumes detect tumor growth significantly earlier than radiologists’ interpretations based solely on visual inspection. Furthermore, AI-generated segmentations—without human review—also identify progression dates earlier than visual inspection, although this comes at the cost of false positives and negatives, underscoring the importance of physician oversight.

We segment the LGG using the MRIMath© FLAIR AI, which is FDA-approved for glioblastoma¹³^,¹⁴; the “golden truth” is established by Board-certified physicians who review the AI-generated segmentations using the FDA-approved MRIMath© Smart Contouring platform. Our design, involving multiple reviewers from different subspecialties, mirrors real-world clinical scenarios. For volumetric assessment, we employed the statistical online change-of-point method to determine the first point of growth.⁷ The change-of-point method applies a consistent and rigorous statistical standard across all patients and studies. Unlike the conventional RANO product rule, which uses a universal approach based on the product of diameters, the change-of-point method assesses whether a current measurement significantly deviates from all previous measurements for the same patient. The variability between the AI-generated predictions and physician-approved contours (DICE score: 87.7) is comparable to the variability observed between different users (DICE score: 87.5). Additionally, the average intra-user DICE score is 93.1. Notably, when the same AI model is applied to glioblastoma FLAIR images, it achieves a similar DICE score of 92.⁹ Together, these findings highlight both the inherent difficulty of precisely delineating FLAIR lesion boundaries and the high accuracy of the AI system in segmenting LGG. In addition, this AI framework can be utilized to track FLAIR changes in post-radiation patients by identifying and quantifying the typical range of treatment-related volumetric alterations.

Most clinical radiologists use visual inspection to evaluate longitudinal studies, while the RANO criteria remain the current standard for detecting progression in clinical trial settings.²¹ Significant limitations of the RANO criteria include large inter-user variability in determining the perpendicular diameters of the largest tumor cross-section and the reliance on 2-dimensional measurements which may not accurately represent the complex 3-dimensional tumor architecture.¹¹^,²² Volumetric analysis was also shown to be superior to other established tumor classification criteria, such as the Response Evaluation Criteria in Solid Tumors (RECIST) and modified RECIST,²³ due to their tendency to oversimplify a multidimensional and heterogeneous tumor.²⁴

The goal of this study is to measure and analyze LGG volumes longitudinally. FLAIR volumes are typically larger than T2 and there is a mismatch between FLAIR and T2 volumes in LGG.²⁵^,²⁶ We have chosen FLAIR because we believe it reflects that most accurate volume of brain infiltration. None of the tumors showed enhancement on T1c. Volumetric measurement remains the gold standard for assessing the growth of LGGs and detecting progression. In particular, tumor volumes and growth rates serve as key prognostic factors.²⁷ One limitation of the present study is the lack of a parallel neuroradiologist-led review of serial MRI examinations according to RANO criteria for determining disease progression. Several studies have reported improved diagnostic accuracy of volumetric analysis as compared to the RANO criteria and radiologists’ interpretations of longitudinal LGG imaging. Fabio et al.report that volumetric measurements detect progression at significantly earlier time points than the RANO criteria⁷^,¹¹; as compared to volumetric ground truth, the accuracy of RANO compared to previous and baseline scans was 21.0% and 56.5%, with an AUC of 0.39 and 0.55, respectively. A recent analysis of the Phase I trial of ivosidenib for LGG also confirmed that 3D volumetric measurements outperform 2D measurements in response assessment with higher inter-reader agreement and lower rates of reader discordance.²⁸ The findings of 2 studies on pediatric gliomas support the conclusion that volumetric analysis is more effective than 2D measurements in diagnosing tumor progression at significantly earlier time points.²⁹^,³⁰ In a study of 6 patients with pediatric glioma, Khalili et al. reported that volumetric analysis detected progression earlier than radiologists’ interpretations.³⁰ Similarly, in a study of recurrent gliomas, Dempsey et al. found that only volumetric measurements of tumor size were predictive of survival, unlike 1D, 2D, and 3D measurements.³¹ One study compared one-dimensional (1D), two-dimensional (2D), and three-dimensional (3D) linear measurements with manual volume measurements in the follow-up of LGG. The authors concluded that 3D “linear” measurement of LGGs is superior to 1D and 2D methods, aligning well with tumor volume calculations, although it is not without limitations.³²

Several methods are available for automatically assessing the volumes of gliomas. Compared to these methods, the MRIMath FLAIR AI stands out for being fully automated, as it does not require preprocessing steps that involve human supervision, such as deboning, interpolation, or registration. Additionally, while MRIMath© processes images in 2D and treats the FLAIR modality independently, other methods use a 3D approach and integrate the 4 modalities: T1, T1c, T2, and FLAIR.¹⁸^,³³ The MRIMath© FLAIR series AI differs from the subcomponent segmentations employed by other platforms. This study focuses on glioma patients who have not received radiation therapy; nonetheless, our methodology is also valuable in patients who had received radiation, in whom white matter changes on FLAIR could be treatment related.

Limitations of our study include a retrospective design, a small dataset from a single institution, using multiple reviewing physicians from different specialties, primarily using FLAIR sequences, and a comparison to visual inspection. Our goal is to evaluate the importance of AI volumetric analysis in real world scenarios where multiple physicians evaluate longitudinal LGG images.

Conclusions

Our study highlights the effectiveness of AI-assisted volumetric analysis using the MRIMath FLAIR AI platform for detecting tumor growth in LGGs. By integrating AI with physician review, we were able to detect tumor progression at significantly earlier time points compared to traditional visual inspection methods. This study underscores the potential of AI in clinical oncology, particularly in enhancing the early detection of tumor growth, while also emphasizing the importance of human review in conjunction with a low-variability platform.

Supplementary Material

vdaf271_Supplementary_Data

vdaf271_supplementary_data.docx^{(307.8KB, docx)}

Contributor Information

Hassan M Fathallah-Shaykh, Department of Neurology, The University of Alabama at Birmingham, Birmingham, AL (H.M.F.-S.).

Houman Sotoudeh, Department of Radiology, University of Texas Southwestern, Dallas, TX.

Markus Bredel, Department of Radiation Oncology, University of Miami, Miami, FL.

Alex Whitley, Central Alabama Radiation Oncology, Montgomery, AL.

Jinsuh Kim, Department of Radiology and Imaging Sciences, Emory University, Atlanta, GA.

Fanny E Morón, Department of Radiology, Baylor College of Medicine, Houston, TX (F.E.M.).

Fabio Raman, Department of Radiology, Johns Hopkins School of Medicine, Baltimore, MD.

Nidhal Bouaynaya, Department of Computer and Electrical Engineering, Rowans University, Glassboro, NJ.

Hayat Rahal, MRIMath, Birmingham, AL.

Supplementary Material

Supplementary material is available online at Neuro-Oncology Advances (https://academic.oup.com/noa).

Lay Summary

Low-grade (WHO grade 2) diffuse gliomas infiltrate the brains of young, otherwise productive adults and cause substantial neurological morbidity. Because volumetric tumor measurements have historically been time-consuming and labor-intensive, current clinical practice largely relies on qualitative visual assessment by doctors. To address this, the National Cancer Institute supported the development of an FDA-cleared AI-enabled online platform that rapidly measures tumor volume and provides physician-reviewed results within minutes. In our study, combining AI with physician oversight allowed tumor growth to be detected much earlier—on average nearly two years sooner—than with standard visual assessment alone. In several cases, the AI-assisted approach identified progression that had been previously considered stable. While the AI alone performs better than standard care, the most accurate results are achieved when physicians review the AI output. This technology could help doctors recognize tumor changes sooner and make more informed decisions about patient care.

Author Contributions

Conceptualization: H.M.F.S., H.S., and N.B.; Methodology: N.B., H.R., H.S., and H.M.F.S.; Software, Validation: M.B., A.W., J.K., F.E.M., F.R., H.R., and H.M.F.S.; Formal analysis: H.M.F.S., H.S., M.B., and N.B.; Investigation: H.M.F.S., H.S., M.B., and N.B.; Resources: H.R., H.M.F.S., N.B., H.S.; Data curation: H.M.F.S., N.B., and H.S.; Writing—original draft preparation: H.M.F.S., H.S., and N.B.; Writing—review & editing: H.S., M.B., A.W., H.R., J.K., F.E.M., F.R., N.B., and H.M.F.S.; Visualization: H.S., M.B., A.W., J.K., F.E.M., F.R., N.B., and H.M.F.S.; Supervision: H.S., N.B., and H.M.F.S.; Project administration: H.S., N.B., H.R., and H.M.F.S.; Funding acquisition: H.M.F.S., H.R. and N.B. All authors have read and agreed to the published version of the manuscript.

Conflict of Interest Statement

H.R. is employed by MRIMath. H.M.F.S. and N.B. are co-founders of MRIMath.

Funding

This research was funded by the National Cancer Institute, Phase II SBIR contract number 75N91022C00051.

Institutional Review Board Statement

The institutional Review Board of the University of Alabama at Birmingham approved the research (IRB-150618007).

Informed Consent Statement

Waiver of informed consent was granted because the research involved no greater than minimal risk and no procedures for which written consent is normally required outside the research context.

Data Availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request. Restrictions apply to the clinical imaging data used in this study to protect patient privacy, and therefore these data cannot be publicly shared.

References

1. Rimmer B, Bolnykh I, Dutton L, et al. Health-related quality of life in adults with low-grade gliomas: a systematic review. Qual Life Res. 2023;32:625-651. [DOI] [PMC free article] [PubMed] [Google Scholar]
2. Mellinghoff IK, van den Bent MJ, Blumenthal DT, INDIGO Trial Investigators, et al. Vorasidenib in IDH1- or IDH2-mutant low-grade glioma. N Engl J Med. 2023;389:589-601. [DOI] [PMC free article] [PubMed] [Google Scholar]
3. Yogendran L, Rudolf M, Yeannakis D, Fuchs K, Schiff D. Navigating disability insurance in the American healthcare system for the low-grade glioma patient. Neurooncol Pract. 2023;10:5-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
4. Reuss DE, Mamatjan Y, Schrimpf D, et al. IDH mutant diffuse and anaplastic astrocytomas have similar age at presentation and little difference in survival: a grading problem for WHO. Acta Neuropathol. 2015;129:867-873. [DOI] [PMC free article] [PubMed] [Google Scholar]
5. Recht LD. Patient education: low-grade glioma in adults (Beyond the Basics). UpToDate. Retrieved January 4, 2024 from https://www.uptodate.com/contents/low-grade-glioma-in-adults-beyond-the-basics, 2024.
6. Nabors LB, Portnow J, Ahluwalia M, et al. Central nervous system cancers, version 3.2020, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2020;18:1537-1570. [DOI] [PubMed] [Google Scholar]
7. Fathallah-Shaykh HM, DeAtkine A, Coffee E, et al. Diagnosing growth in low-grade gliomas with and without longitudinal volume measurements: a retrospective observational study. PLoS Med. 2019;16:e1002810. [DOI] [PMC free article] [PubMed] [Google Scholar]
8. van den Bent MJ, Wefel JS, Schiff D, et al. Response Assessment in Neuro-Oncology (a report of the RANO group): assessment of outcome in trials of diffuse low-grade gliomas. Lancet Oncol. 2011;12:583-593. [DOI] [PubMed] [Google Scholar]
9. Barhoumi Y, Fattah AH, Bouaynaya N, et al. Robust AI-driven segmentation of glioblastoma T1c and FLAIR MRI series and the low variability of the MriMath© smart manual contouring platform. Diagnostics. 2024;14(11):1066. [DOI] [PMC free article] [PubMed] [Google Scholar]
10. Ellingson BM, Sanvito F, Cloughesy TF, et al. A neuroradiologist’s guide to operationalizing the Response Assessment in Neuro-Oncology (RANO) criteria version 2.0 for gliomas in adults. AJNR Am J Neuroradiol. 2024;45:1846-1856. [DOI] [PMC free article] [PubMed] [Google Scholar]
11. Raman F, Mullen A, Byrd M, et al. Evaluation of RANO criteria for the assessment of tumor progression for lower-grade gliomas. Cancers (Basel). 2023;15(3): 3274. [DOI] [PMC free article] [PubMed] [Google Scholar]
12. Dera D, Bouaynaya N, Fathallah-Shaykh HM. Automated robust image segmentation: level set method using nonnegative matrix factorization with application to brain MRI. Bull Math Biol. 2016;78:1450-1476. [DOI] [PubMed] [Google Scholar]
13. MRIMath LLC, a Medical AI Company, Announces FDA 510(k) Clearance for its Cyber Device and AI Solution for Glioblastoma. Businesswire. https://www.businesswire.com/news/home/20240820135568/en/MRIMath-LLC-a-Medical-AI-Company-Announces-FDA-510-k-Clearance-for-its-Cyber-Device-and-AI-Solution-for-Glioblastoma, 2024.
14. i2Contour. U.S. Food and Drug Administration. https://www.accessdata.fda.gov/cdrh_docs/pdf23/K233822.pdf, 2024, Access Date: 01/09/2026.
15. Cepeda S, Romero R, Luque L, et al. Deep learning-based postoperative glioblastoma segmentation and extent of resection evaluation: development, external validation, and model comparison. Neurooncol Adv. 2024;6:vdae199. [DOI] [PMC free article] [PubMed] [Google Scholar]
16. Menze BH, Jakab A, Bauer S, et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging. 2015;34:1993-2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
17. Aminikhanghahi S, Cook DJ. A survey of methods for time series change point detection. Knowl Inf Syst. 2017;51:339-367. [DOI] [PMC free article] [PubMed] [Google Scholar]
18. Abayazeed AH, Abbassy A, Mueller M, et al. NS-HGlio: a generalizable and repeatable HGG segmentation and volumetric measurement AI algorithm for the longitudinal MRI assessment to inform RANO in trials and clinics. Neurooncol Adv. 2023;5:vdac184. [DOI] [PMC free article] [PubMed] [Google Scholar]
19. Truong COL Vayatis N. Selective review of offline change point detection methods. Signal Processing. 2020;167:107299. [Google Scholar]
20. Killick R. Methods for Changepoint Detection. Vol https://github.com/rkillick/changepoint/. Access Date 01/09/2026.
21. Jakola AS, Reinertsen I. Radiological evaluation of low-grade glioma: time to embrace quantitative data? Acta Neurochir (Wien). 2019;161:577-578. [DOI] [PubMed] [Google Scholar]
22. Ramakrishnan D, von Reppert M, Krycia M, et al. Evolution and implementation of radiographic response criteria in neuro-oncology. Neurooncol Adv. 2023;5:vdad118. [DOI] [PMC free article] [PubMed] [Google Scholar]
23. Hajkova M, Andrasina T, Ovesna P, et al. Volumetric analysis of hepatocellular carcinoma after transarterial chemoembolization and its impact on overall survival. In Vivo. 2022;36:2332-2341. [DOI] [PMC free article] [PubMed] [Google Scholar]
24. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45:228-247. [DOI] [PubMed] [Google Scholar]
25. Yildirim O, Young RJ. Advanced imaging of IDH-mutant gliomas: precision in diagnosis and management. Adv Cancer Res. 2025;166:59-80. [DOI] [PubMed] [Google Scholar]
26. Stall B, Zach L, Ning H, et al. Comparison of T2 and FLAIR imaging for target delineation in high grade gliomas. Radiat Oncol. 2010;5:5. [DOI] [PMC free article] [PubMed] [Google Scholar]
27. Gui C, Lau JC, Kosteniuk SE, Lee DH, Megyesi JF. Radiology reporting of low-grade glioma growth underestimates tumor expansion. Acta Neurochir (Wien). 2019;161:569-576. [DOI] [PubMed] [Google Scholar]
28. Ellingson BM, Kim GHJ, Brown M, et al. Volumetric measurements are preferred in the evaluation of mutant IDH inhibition in non-enhancing diffuse gliomas: evidence from a phase I trial of ivosidenib. Neuro Oncol. 2022;24:770-778. [DOI] [PMC free article] [PubMed] [Google Scholar]
29. von Reppert M, Ramakrishnan D, Bruningk SC, et al. Comparison of volumetric and 2D-based response methods in the PNOC-001 pediatric low-grade glioma clinical trial. Neurooncol Adv. 2024;6:vdad172. [DOI] [PMC free article] [PubMed] [Google Scholar]
30. Khalili NK, Bagheri S, Familiar A, et al. A. Volumetric measurement of tumor size outperforms standard two-dimensional method in early prediction of tumor progression in pediatric glioma. Neuro Oncol. 2023;25:i48. A.F [Google Scholar]
31. Dempsey MF, Condon BR, Hadley DM. Measurement of tumor "size" in recurrent malignant glioma: 1D, 2D, or 3D? AJNR Am J Neuroradiol. 2005;26:770-776. [PMC free article] [PubMed] [Google Scholar]
32. Dos Santos T, Deverdun J, Chaptal T, et al. Diffuse low-grade glioma: what is the optimal linear measure to assess tumor growth? Neurooncol Adv. 2024;6:vdae044. [DOI] [PMC free article] [PubMed] [Google Scholar]
33. Zhang Y, Zhong P, Jie D, et al. Brain tumor segmentation from multi-modal MR images via ensembling UNets. Front Radiol. 2021;1:704888. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

vdaf271_Supplementary_Data

vdaf271_supplementary_data.docx^{(307.8KB, docx)}

Data Availability Statement

[vdaf271-B1] 1. Rimmer B, Bolnykh I, Dutton L, et al. Health-related quality of life in adults with low-grade gliomas: a systematic review. Qual Life Res. 2023;32:625-651. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B2] 2. Mellinghoff IK, van den Bent MJ, Blumenthal DT, INDIGO Trial Investigators, et al. Vorasidenib in IDH1- or IDH2-mutant low-grade glioma. N Engl J Med. 2023;389:589-601. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B3] 3. Yogendran L, Rudolf M, Yeannakis D, Fuchs K, Schiff D. Navigating disability insurance in the American healthcare system for the low-grade glioma patient. Neurooncol Pract. 2023;10:5-12. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B4] 4. Reuss DE, Mamatjan Y, Schrimpf D, et al. IDH mutant diffuse and anaplastic astrocytomas have similar age at presentation and little difference in survival: a grading problem for WHO. Acta Neuropathol. 2015;129:867-873. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B5] 5. Recht LD. Patient education: low-grade glioma in adults (Beyond the Basics). UpToDate. Retrieved January 4, 2024 from https://www.uptodate.com/contents/low-grade-glioma-in-adults-beyond-the-basics, 2024.

[vdaf271-B6] 6. Nabors LB, Portnow J, Ahluwalia M, et al. Central nervous system cancers, version 3.2020, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2020;18:1537-1570. [DOI] [PubMed] [Google Scholar]

[vdaf271-B7] 7. Fathallah-Shaykh HM, DeAtkine A, Coffee E, et al. Diagnosing growth in low-grade gliomas with and without longitudinal volume measurements: a retrospective observational study. PLoS Med. 2019;16:e1002810. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B8] 8. van den Bent MJ, Wefel JS, Schiff D, et al. Response Assessment in Neuro-Oncology (a report of the RANO group): assessment of outcome in trials of diffuse low-grade gliomas. Lancet Oncol. 2011;12:583-593. [DOI] [PubMed] [Google Scholar]

[vdaf271-B9] 9. Barhoumi Y, Fattah AH, Bouaynaya N, et al. Robust AI-driven segmentation of glioblastoma T1c and FLAIR MRI series and the low variability of the MriMath© smart manual contouring platform. Diagnostics. 2024;14(11):1066. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B10] 10. Ellingson BM, Sanvito F, Cloughesy TF, et al. A neuroradiologist’s guide to operationalizing the Response Assessment in Neuro-Oncology (RANO) criteria version 2.0 for gliomas in adults. AJNR Am J Neuroradiol. 2024;45:1846-1856. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B11] 11. Raman F, Mullen A, Byrd M, et al. Evaluation of RANO criteria for the assessment of tumor progression for lower-grade gliomas. Cancers (Basel). 2023;15(3): 3274. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B12] 12. Dera D, Bouaynaya N, Fathallah-Shaykh HM. Automated robust image segmentation: level set method using nonnegative matrix factorization with application to brain MRI. Bull Math Biol. 2016;78:1450-1476. [DOI] [PubMed] [Google Scholar]

[vdaf271-B13] 13. MRIMath LLC, a Medical AI Company, Announces FDA 510(k) Clearance for its Cyber Device and AI Solution for Glioblastoma. Businesswire. https://www.businesswire.com/news/home/20240820135568/en/MRIMath-LLC-a-Medical-AI-Company-Announces-FDA-510-k-Clearance-for-its-Cyber-Device-and-AI-Solution-for-Glioblastoma, 2024.

[vdaf271-B14] 14. i2Contour. U.S. Food and Drug Administration. https://www.accessdata.fda.gov/cdrh_docs/pdf23/K233822.pdf, 2024, Access Date: 01/09/2026.

[vdaf271-B15] 15. Cepeda S, Romero R, Luque L, et al. Deep learning-based postoperative glioblastoma segmentation and extent of resection evaluation: development, external validation, and model comparison. Neurooncol Adv. 2024;6:vdae199. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B16] 16. Menze BH, Jakab A, Bauer S, et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging. 2015;34:1993-2024. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B17] 17. Aminikhanghahi S, Cook DJ. A survey of methods for time series change point detection. Knowl Inf Syst. 2017;51:339-367. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B18] 18. Abayazeed AH, Abbassy A, Mueller M, et al. NS-HGlio: a generalizable and repeatable HGG segmentation and volumetric measurement AI algorithm for the longitudinal MRI assessment to inform RANO in trials and clinics. Neurooncol Adv. 2023;5:vdac184. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B19] 19. Truong COL Vayatis N. Selective review of offline change point detection methods. Signal Processing. 2020;167:107299. [Google Scholar]

[vdaf271-B20] 20. Killick R. Methods for Changepoint Detection. Vol https://github.com/rkillick/changepoint/. Access Date 01/09/2026.

[vdaf271-B21] 21. Jakola AS, Reinertsen I. Radiological evaluation of low-grade glioma: time to embrace quantitative data? Acta Neurochir (Wien). 2019;161:577-578. [DOI] [PubMed] [Google Scholar]

[vdaf271-B22] 22. Ramakrishnan D, von Reppert M, Krycia M, et al. Evolution and implementation of radiographic response criteria in neuro-oncology. Neurooncol Adv. 2023;5:vdad118. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B23] 23. Hajkova M, Andrasina T, Ovesna P, et al. Volumetric analysis of hepatocellular carcinoma after transarterial chemoembolization and its impact on overall survival. In Vivo. 2022;36:2332-2341. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B24] 24. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45:228-247. [DOI] [PubMed] [Google Scholar]

[vdaf271-B25] 25. Yildirim O, Young RJ. Advanced imaging of IDH-mutant gliomas: precision in diagnosis and management. Adv Cancer Res. 2025;166:59-80. [DOI] [PubMed] [Google Scholar]

[vdaf271-B26] 26. Stall B, Zach L, Ning H, et al. Comparison of T2 and FLAIR imaging for target delineation in high grade gliomas. Radiat Oncol. 2010;5:5. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B27] 27. Gui C, Lau JC, Kosteniuk SE, Lee DH, Megyesi JF. Radiology reporting of low-grade glioma growth underestimates tumor expansion. Acta Neurochir (Wien). 2019;161:569-576. [DOI] [PubMed] [Google Scholar]

[vdaf271-B28] 28. Ellingson BM, Kim GHJ, Brown M, et al. Volumetric measurements are preferred in the evaluation of mutant IDH inhibition in non-enhancing diffuse gliomas: evidence from a phase I trial of ivosidenib. Neuro Oncol. 2022;24:770-778. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B29] 29. von Reppert M, Ramakrishnan D, Bruningk SC, et al. Comparison of volumetric and 2D-based response methods in the PNOC-001 pediatric low-grade glioma clinical trial. Neurooncol Adv. 2024;6:vdad172. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B30] 30. Khalili NK, Bagheri S, Familiar A, et al. A. Volumetric measurement of tumor size outperforms standard two-dimensional method in early prediction of tumor progression in pediatric glioma. Neuro Oncol. 2023;25:i48. A.F [Google Scholar]

[vdaf271-B31] 31. Dempsey MF, Condon BR, Hadley DM. Measurement of tumor "size" in recurrent malignant glioma: 1D, 2D, or 3D? AJNR Am J Neuroradiol. 2005;26:770-776. [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B32] 32. Dos Santos T, Deverdun J, Chaptal T, et al. Diffuse low-grade glioma: what is the optimal linear measure to assess tumor growth? Neurooncol Adv. 2024;6:vdae044. [DOI] [PMC free article] [PubMed] [Google Scholar]

[vdaf271-B33] 33. Zhang Y, Zhong P, Jie D, et al. Brain tumor segmentation from multi-modal MR images via ensembling UNets. Front Radiol. 2021;1:704888. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Diagnosing growth in low-grade gliomas with and without artificial intelligence-measured longitudinal volume measurements: A retrospective observational study

Hassan M Fathallah-Shaykh

Houman Sotoudeh

Markus Bredel

Alex Whitley

Jinsuh Kim

Fanny E Morón

Fabio Raman

Nidhal Bouaynaya

Hayat Rahal

AbstractAbstract

Background

Methods

Results

Conclusions

Key Points.

Methods

Ethical Approval

Study Design and Patient Selection

Time to Growth Detected by Standard Clinical Care

The MRIMath Cyber Device

Tumor Segmentation, Physician Review, and Volume Calculations

Online Change-of-Point Detection

Results

Review Time, Dice Scores, and Variability

Clinical Progression Group

Table 1.

Figure 1.

Clinically Stable Group

Table 2.

Figure 2.

Negative Control Group

False Negative and False Positive Rates

False negative rates

Table 3.

False positive rates

Discussion

Conclusions

Supplementary Material

Contributor Information

Supplementary Material

Lay Summary

Author Contributions

Conflict of Interest Statement

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability

References

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases