Scientific Reports. 2025 Jul 29;15:27533. doi: 10.1038/s41598-025-08608-9

Prediction of MGMT methylation status in glioblastoma patients based on radiomics feature extracted from intratumoral and peritumoral MRI imaging

Wang-Sheng Chen 1,2,#, Fang-Xiong Fu 1,3,#, Qin-Lei Cai 1,#, Fei Wang 1, Xue-Hua Wang 1, Lan Hong 4, Li Su 2,5
PMCID: PMC12307707  PMID: 40730593

Abstract

Assessing MGMT promoter methylation is crucial for determining appropriate glioblastoma therapy. Previous studies have focused on intratumoral regions, overlooking the peritumoral area. This study aimed to develop a radiomic model using MRI-derived features from both regions. We included 96 glioblastoma patients randomly allocated to training and testing sets. Radiomic features were extracted from intratumoral and peritumoral regions. We constructed and compared radiomic models based on intratumoral, peritumoral, and combined features. Model performance was evaluated using the area under the receiver-operating characteristic curve (AUC). The combined radiomic model achieved an AUC of 0.814 (95% CI: 0.767–0.862) in the training set and 0.808 (95% CI: 0.736–0.859) in the testing set, outperforming models based on intratumoral or peritumoral features alone. Calibration and decision curve analyses demonstrated excellent model fit and clinical utility. The radiomic model incorporating both intratumoral and peritumoral features shows promise in differentiating MGMT methylation status, potentially informing clinical treatment strategies for glioblastoma.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-08608-9.

Keywords: Glioblastoma, MGMT methylation, Radiomics, MRI imaging, Machine learning, Personalized treatment

Subject terms: Neurological disorders, Cancer

Introduction

Glioblastoma presents a formidable challenge due to its aggressive nature, high proliferation rate, and resistance to conventional treatments, leading to a median survival time of around 15 months [1]. The heterogeneity of glioblastoma at genetic and cellular levels, along with the limitations posed by the blood-brain barrier, further complicates treatment strategies [2,3]. Of particular importance in precision medicine for glioblastoma is the status of O6-methylguanine-DNA methyltransferase (MGMT), a DNA repair enzyme that influences response to alkylating agents such as temozolomide (TMZ) [4,5]. When the tumor MGMT promoter is methylated, the DNA repair activity of MGMT is inhibited, making tumor cells more sensitive to the cytotoxic effects of TMZ; promoter methylation has accordingly been associated with improved response to TMZ and prolonged survival [6]. Therefore, accurate determination of MGMT status is crucial for guiding treatment decisions and optimizing therapeutic outcomes in glioblastoma patients [6].

The current mainstream technique for determining MGMT status in glioblastoma is methylation-specific polymerase chain reaction (MS-PCR) analysis [7]. This method is widely used due to its sensitivity, specificity, and relatively low cost. However, it has several limitations that may affect its accuracy and reliability, such as the low DNA quantity obtained from small biopsy specimens and variations in DNA quality caused by tissue fixation and processing [6,8]. Radiomics offers a promising alternative that could overcome these limitations. By using quantitative features extracted from medical images such as MRI [9], a radiomics model can potentially determine MGMT status in glioblastoma with high accuracy and reproducibility. For instance, Doniselli et al. performed a meta-analysis of 26 published studies on MGMT promoter methylation prediction [10], which indicates the extensive attention this issue has received. However, most previous reports focus only on intratumoral regions and exclude the peritumoral region, which may contain valuable information [11,12]. We hypothesized that combining MRI-based intratumoral and peritumoral radiomics could better predict MGMT status.

In this study, we aim to extract quantitative radiomics features from both intratumoral and peritumoral regions in MRI images. The features can be used to construct a radiomics model for predicting MGMT status in glioblastoma.

Materials and methods

Patients

In this retrospective study, a total of 96 patients with glioblastoma confirmed by histopathological biopsy between 2000 and 2023 were consecutively enrolled from the Department of Radiology at Hainan General Hospital. The inclusion criteria were as follows: (i) adults with histologically confirmed primary glioblastoma of the central nervous system; (ii) isocitrate dehydrogenase (IDH) expression status testing performed; (iii) complete preoperative axial T1-weighted contrast-enhanced (T1C), T2, and T2-fluid-attenuated inversion recovery (T2-FLAIR) data available; (iv) MR images obtained without artifacts affecting image observation and post-processing; (v) no prior radiotherapy, chemotherapy, or other treatments before surgery; (vi) measurable enhancing lesions evident on post-contrast Gd-enhanced T1-weighted MRI within the 80% isodose line range following concurrent chemoradiotherapy (CCRT). The exclusion criteria were as follows: (i) presence of motion artifacts, metal artifacts, or similar artifacts affecting image quality; (ii) preexisting history of other tumors or surgery, including extracranial conditions. The detailed process of patient recruitment is presented in Fig. 1. This retrospective study was approved by the institutional review board of Hainan General Hospital. Due to the retrospective nature of the study, the requirement for informed consent was waived by the Hainan General Hospital Medical Ethics Committee.

Fig. 1. Flow chart of the patient selection process.

MRI image acquisition

We chose the T1C sequence, which better shows the degree of enhancement of the tumor itself, and included the T2 and T2-FLAIR sequences, which better depict edema in the peritumoral area. Axial T2-weighted imaging was performed on a 3.0T MRI scanner (Siemens, Verio) using a fast spin-echo sequence with TR/TE of 6000 ms/99 ms, a matrix of 320 × 192, a field of view (FOV) of 24 × 24 cm², and a voxel size of 0.4 × 0.4 × 6 mm³. The T2-FLAIR sequence had TR/TE of 9000 ms/94 ms, a matrix of 320 × 192, an FOV of 24 × 24 cm², and a voxel size of 0.5 × 0.5 × 6 mm³. For contrast-enhanced T1-weighted imaging (T1C), a three-dimensional MPRAGE sequence was used with TR/TE of 2100 ms/2.299 ms, a matrix of 512 × 512, an FOV of 23 × 23 cm², and a voxel size of 0.9 × 0.9 × 0.9 mm³. Before the examination, patients removed any metallic items, including dentures and metal bracelets. During the scan, patients were asked to remain as still as possible, in particular keeping the head immobile to avoid motion artifacts. All examinations were performed in the supine, head-first position. To standardize scan alignment, the anterior commissure-posterior commissure (AC-PC) line was used as the reference line.

ROI segmentation

The tumor MRI images were manually segmented using 3D Slicer software (version 5.2.2; https://www.slicer.org/). The intratumoral region of interest (ROI) covering the entire tumor was delineated on the T1C sequence, including the significantly enhanced part and the liquefied necrosis area while avoiding blood vessels and recognizable peritumoral edema. The peritumoral ROI was delineated on the T2-FLAIR sequence as the peritumoral T2 high-signal area. Segmentation was performed by a neuroradiologist with 5 years of experience in neuroradiology and subsequently verified by another neuroradiologist with 5 years of experience in the field. Any differences between the two neuroradiologists were resolved by consensus. The neuroradiologists were blinded to other patient information when determining the ROIs. Finally, the intratumoral and peritumoral ROIs were automatically mapped to the corresponding positions on the remaining sequence images (Fig. 2).

Fig. 2. Representative case of ROI segmentation. (A) Original MRI image of T2-FLAIR sequence. (B) Intratumoral ROI. (C) Peritumoral ROI.

Radiomics feature selection

Radiomics features were automatically extracted from the delineated intratumoral or peritumoral ROI of each MRI sequence using the Pyradiomics package in the training set. The initial radiomics features spanned the following primary categories: shape, first order, gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), gray level dependence matrix (GLDM), and neighbouring gray tone difference matrix (NGTDM). To reduce the dimensionality of the feature set and mitigate overfitting, features with zero variance were excluded, and a t-test was carried out to identify statistically significant features. Additionally, Pearson correlation coefficients (PCC) were calculated between each pair of radiomics features; when the absolute PCC of a pair exceeded 0.9, the feature that was less significant in the t-test was removed. Following these steps, least absolute shrinkage and selection operator (LASSO) regression was applied for further feature refinement (Fig. 3).
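The three filters that precede LASSO can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `prefilter`, the toy feature names, and the greedy "most significant first" visiting order are hypothetical choices, and the t-test p-values are assumed to have been computed beforehand.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prefilter(features, p_values, alpha=0.05, r_max=0.9):
    """features: dict mapping feature name -> list of values (one per patient).
    p_values: dict mapping feature name -> t-test p-value (assumed precomputed).
    Returns the surviving feature names, most significant first."""
    # 1) Drop zero-variance (constant) features.
    kept = [f for f, v in features.items() if len(set(v)) > 1]
    # 2) Keep only features that are significant in the t-test.
    kept = [f for f in kept if p_values[f] < alpha]
    # 3) Visit features from most to least significant and drop any feature
    #    whose |r| with an already selected feature exceeds r_max, so the
    #    less significant member of a correlated pair is the one removed.
    kept.sort(key=lambda f: p_values[f])
    selected = []
    for f in kept:
        if all(abs(pearson(features[f], features[g])) <= r_max for g in selected):
            selected.append(f)
    return selected

# Toy data (hypothetical): six patients, five candidate features.
features = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12.5],   # nearly collinear with "a"
    "c": [5, 1, 4, 2, 6, 3],
    "const": [7, 7, 7, 7, 7, 7],   # zero variance
    "d": [1, 3, 2, 5, 4, 6],       # not significant
}
p_values = {"a": 0.001, "b": 0.01, "c": 0.03, "const": 1.0, "d": 0.2}
selected = prefilter(features, p_values)
print(selected)
```

The surviving features would then be passed to LASSO; that step is omitted here because the source does not state the exact LASSO formulation used.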

Fig. 3. Schematic representation of radiomics feature selection using LASSO regression. (A) The k-fold cross-validation technique, employed by varying the lambda (λ) parameter, identifies the optimal set of characteristic features. (B) The compression diagram illustrating the k-fold cross-validation approach for selecting characteristic features.

Model construction and performance evaluation

We calculated a radiomics feature score (Rad-score) for each patient based on the random forest (RF) model in the training set. To assess the prediction of MGMT status, we evaluated the area under the receiver-operator characteristic (ROC) curve (AUC), classification accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) in both the training and testing sets. Decision curve analysis (DCA) was performed to assess clinical utility, and a calibration curve was constructed to evaluate the accuracy of the predicted probabilities.
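As an illustration of the AUC metric, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen methylated case receives a higher Rad-score than a randomly chosen unmethylated case. The function name `roc_auc` and the toy scores are hypothetical; the source does not describe how the AUC or its confidence interval was computed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical Rad-scores): 1 = methylated, 0 = unmethylated.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy case 10 of the 12 positive-negative pairs are ordered correctly, so the AUC is 10/12.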

Data preparation

The radiomics feature dataset consists of 8 categorical attributes, which require pre-processing before machine-learning models can be tested and evaluated. After the dataset is loaded in Python, the data are categorized, with a separate command for each attribute of the feature vector. A one-hot encoder is used to fit and transform the categorical variables. Before the optimal model for the research data is finalized, cross-validation is carried out using logistic regression and support vector machine models to better optimize the model (Figure 1S).

Cross-validation of machine learning models

To identify the most effective machine-learning model, four widely recognized algorithms were compared: logistic regression, support vector machine, k-nearest neighbors, and random forest. Baseline results were compared with cross-validated results on a standardized dataset, which was resampled with different numbers of splits. The process begins with dataset preparation, in which a one-hot encoder is applied to the categorical variables and the data are normalized. The pre-processed radiomics feature dataset is then split into training and test sets using an 80:20 split that maintains class proportions, the accuracy of each individual model is estimated, and 5-fold cross-validation is performed for robust evaluation. The same 5-fold cross-validation is used to plot a learning curve for each model. The confusion matrix and ROC-AUC curve of each model are compared with the respective model summaries, in combination with the learning curve of each step. As a result, data scientists can obtain the optimal model selection for similar data samples (Fig. 2S).
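The splitting scheme described above (an 80:20 split that preserves class proportions, followed by 5-fold cross-validation on the training portion) can be sketched in plain Python as follows. The function names, the seed, and the round-robin fold assignment are illustrative assumptions; the authors' actual implementation and random allocation are not given in the source, so the resulting set sizes need not match those reported in the paper.

```python
import random
from collections import defaultdict

def group_by_class(indices, labels):
    """Map each class label to the list of sample indices carrying it."""
    groups = defaultdict(list)
    for i in indices:
        groups[labels[i]].append(i)
    return groups

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Hold out test_fraction of each class so class proportions are kept."""
    rng = random.Random(seed)
    train, test = [], []
    for idx in group_by_class(range(len(labels)), labels).values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def stratified_kfold(indices, labels, k=5, seed=0):
    """Yield (train, validation) index lists for k stratified folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idx in group_by_class(indices, labels).values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class round-robin over folds
    for f in range(k):
        val = sorted(folds[f])
        trn = sorted(i for g in range(k) if g != f for i in folds[g])
        yield trn, val

# Class counts taken from the cohort: 34 methylated (1), 62 unmethylated (0).
labels = [1] * 34 + [0] * 62
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
folds = list(stratified_kfold(train_idx, labels, k=5, seed=0))
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Each sample of the training portion appears in exactly one validation fold, and every fold contains both classes, which is what makes per-fold accuracy or AUC estimates comparable.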

Statistical analysis

The independent-samples t-test was used to compare continuous variables among the clinical and pathological characteristics, while the chi-square test was used to analyze categorical variables. A two-tailed p value < 0.05 was considered statistically significant. The statistical analyses and figure plotting were conducted using R software (version 4.3.1; https://www.r-project.org).
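As a worked example of the chi-square test on a 2 × 2 table, the sketch below uses the sex-by-methylation counts from Table 1 and the methylation-by-cohort counts from Table 2. It assumes a Yates continuity correction, which is the default for 2 × 2 tables in R's `chisq.test`; the source does not state the exact call, so this is an assumption, although it yields p-values close to those reported (0.706 and 0.974). The function name is hypothetical.

```python
import math

def chi2_yates_2x2(table):
    """Chi-square test with Yates continuity correction for a 2x2 table.
    Returns (statistic, p_value) for one degree of freedom."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Continuity correction: shrink |O - E| by 0.5, never below zero.
            diff = max(abs(observed - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Table 1, sex: rows F/M, columns unmethylated/methylated.
chi2_sex, p_sex = chi2_yates_2x2([[22, 10], [40, 24]])
# Table 2, MGMT: rows unmethylated/methylated, columns train/test.
chi2_mgmt, p_mgmt = chi2_yates_2x2([[49, 13], [26, 8]])
print(round(p_sex, 3), round(p_mgmt, 3))
```

The closed-form p-value via `math.erfc` is valid only for one degree of freedom, which is the case for any 2 × 2 table.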

Results

Patient characteristics

The clinical information of the patients is shown in Table 1. Of the 96 patients, 34 belonged to the MGMT methylated group and 62 belonged to the MGMT unmethylated group. No significant differences were observed between the two groups regarding age, sex, and tumor characteristics. The patients were randomly assigned to either a training set (n = 75) or a testing set (n = 21), with details provided in Table 2.

Table 1. Clinical characteristics of patients grouped by MGMT methylation status.

| Characteristic | Total (N = 96) | Unmethylated (N = 62) | Methylated (N = 34) | P-value |
|---|---|---|---|---|
| Sex: F | 32 (33.3%) | 22 (35.5%) | 10 (29.4%) | 0.706 |
| Sex: M | 64 (66.7%) | 40 (64.5%) | 24 (70.6%) | |
| Age, mean (SD) | 55.5 (13.4) | 55.0 (13.2) | 56.5 (14.0) | 0.614 |
| Tumor volume (mm³), mean (SD) | 32,100 (23,100) | 35,400 (23,300) | 26,100 (21,900) | 0.0568 |
| Slice thickness (mm), mean (SD) | 5.29 (0.845) | 5.34 (0.580) | 5.19 (1.19) | 0.474 |
| Spatial resolution (mm), mean (SD) | 0.795 (0.250) | 0.773 (0.253) | 0.836 (0.244) | 0.236 |

Table 2. Clinical characteristics of patients in training and testing cohorts.

| Characteristic | Total (N = 96) | Train (N = 75) | Test (N = 21) | P-value |
|---|---|---|---|---|
| MGMT: Unmethylated | 62 (64.6%) | 49 (65.3%) | 13 (61.9%) | 0.974 |
| MGMT: Methylated | 34 (35.4%) | 26 (34.7%) | 8 (38.1%) | |
| Sex: F | 32 (33.3%) | 27 (36.0%) | 5 (23.8%) | 0.432 |
| Sex: M | 64 (66.7%) | 48 (64.0%) | 16 (76.2%) | |
| Age, mean (SD) | 55.5 (13.4) | 54.7 (14.2) | 58.4 (10.0) | 0.183 |
| Tumor volume (mm³), mean (SD) | 32,100 (23,100) | 33,200 (23,300) | 28,400 (22,600) | 0.399 |
| Slice thickness (mm), mean (SD) | 5.29 (0.845) | 5.23 (0.872) | 5.48 (0.728) | 0.203 |
| Spatial resolution (mm), mean (SD) | 0.795 (0.250) | 0.787 (0.252) | 0.824 (0.248) | 0.549 |

Intratumoral and peritumoral analysis

Following the selection of the radiomics features, the features most strongly correlated with MGMT methylation status were identified in multiple MRI sequences of either intratumoral or peritumoral ROI (Table S1). Thus, a series of radiomics models were developed to calculate Rad-scores that measure the probability of MGMT methylation (Table 3). In the testing set, the intratumoral model yielded AUC values ranging from 0.489 to 0.610, while the peritumoral model achieved AUC values ranging from 0.593 to 0.769.

Table 3. The performance indicators of intratumoral and peritumoral models based on single or multiple MRI sequences.

| Model | AUC | Accuracy | C-index | F1-score | NPV | PPV | Precision | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|---|
| Intratumoral T1C | 0.582 | 0.700 | 0.700 | 0.250 | 0.684 | 1.000 | 1.000 | 0.143 | 1.000 |
| Intratumoral T2 | 0.511 | 0.550 | 0.550 | 0.571 | 0.833 | 0.429 | 0.429 | 0.857 | 0.385 |
| Intratumoral T2-FLAIR | 0.489 | 0.650 | 0.650 | 0.000 | 0.650 | n/a | 0.000 | 0.000 | 1.000 |
| Intratumoral Merged | 0.610 | 0.650 | 0.650 | 0.462 | 0.714 | 0.500 | 0.500 | 0.429 | 0.769 |
| Peritumoral T1C | 0.747 | 0.650 | 0.650 | 0.632 | 0.875 | 0.500 | 0.500 | 0.857 | 0.539 |
| Peritumoral T2 | 0.593 | 0.700 | 0.700 | 0.250 | 0.684 | 1.000 | 1.000 | 0.143 | 1.000 |
| Peritumoral T2-FLAIR | 0.769 | 0.700 | 0.700 | 0.625 | 0.818 | 0.556 | 0.556 | 0.714 | 0.692 |
| Peritumoral Merged | 0.703 | 0.700 | 0.700 | 0.250 | 0.684 | 1.000 | 1.000 | 0.143 | 1.000 |
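To make the relationship between the threshold-based indicators in Table 3 concrete, the sketch below derives them from a confusion matrix. The counts used are not reported in the source: they are inferred here under the assumption that 7 methylated and 13 unmethylated cases were evaluated, which happens to reproduce the tabulated values for the intratumoral T1C and T2-FLAIR rows. The function name is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard indicators from confusion-matrix counts.
    A ratio with a zero denominator is returned as None (undefined)."""
    def ratio(num, den):
        return num / den if den else None
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Inferred counts (assumption): one of 7 methylated cases detected, no false alarms.
t1c = binary_metrics(tp=1, fp=0, tn=13, fn=6)
# Inferred counts (assumption): every case predicted as unmethylated.
flair = binary_metrics(tp=0, fp=0, tn=13, fn=7)

for name, m in (("Intratumoral T1C", t1c), ("Intratumoral T2-FLAIR", flair)):
    shown = {k: (None if v is None else round(v, 3)) for k, v in m.items()}
    print(name, shown)
```

The second case shows why a PPV can be undefined: when a model makes no positive predictions, the denominator TP + FP is zero. It also shows how accuracy can remain at 0.650 or 0.700 while sensitivity is near zero in a cohort dominated by the unmethylated class.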

Integrated radiomics analysis

We further combined the intratumoral and peritumoral features to develop an integrated radiomics model, which calculated a Rad-score for each sample. As shown in Fig. 4, the mean Rad-score in the methylated group was significantly higher than in the unmethylated group. The integrated radiomics model demonstrated strong discriminative performance with an AUC of 0.814 (95% CI: 0.767–0.862) in the 5-fold cross-validation of the training cohort and 0.808 (95% CI: 0.736–0.859) in the testing cohort (Fig. 5).

Fig. 4. Boxplots of Rad-score distribution in MGMT methylated and unmethylated groups. (A) Training set. (B) Testing set. ***, p < 0.001; *, p < 0.05.

Fig. 5. Receiver operating characteristic curves of radiomics model in the (A) training cohort and (B) testing cohort.

Figure 6A presents the calibration curve of the model in the testing set. The blue classifier curve closely follows the orange perfect-calibration line within most prediction probability intervals, indicating favorable calibration performance. The Hosmer-Lemeshow (H-L) test yields a P-value of 0.08, exceeding the 0.05 significance level, which suggests no significant calibration bias: the predicted probabilities align well with the observed outcomes. Figure 6B shows the decision curve. The model’s net benefit surpasses the “treat all” and “treat none” strategies when the threshold probability is below 0.4, particularly between 0.1 and 0.3, indicating a clinical decision advantage. As the threshold probability rises, the net benefit declines, matching or falling below the “treat all” strategy beyond 0.4. Using the radiomics model to predict MGMT status therefore provides more net benefit for treatment decisions than the “treat all” and “treat none” strategies in the low-to-moderate threshold probability range.

Fig. 6. Additional evaluation on radiomics model performance. (A) Calibration curve and (B) decision curve for radiomics model in testing cohort.
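The net benefit plotted in a decision curve can be computed as in the following sketch, which uses the standard definition, net benefit = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The function names and the toy predictions are hypothetical and are not taken from the study data.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted probability
    is at least the threshold probability."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of the 'treat all' strategy at the same threshold."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# The 'treat none' strategy has a net benefit of 0 at every threshold.

# Toy data (hypothetical): 3 positives among 10 patients.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probs = [0.8, 0.6, 0.25, 0.5, 0.3, 0.2, 0.15, 0.1, 0.1, 0.05]
nb_model = net_benefit(labels, probs, threshold=0.2)
nb_all = net_benefit_treat_all(labels, threshold=0.2)
print(round(nb_model, 3), round(nb_all, 3))
```

A model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero; repeating the calculation over a grid of thresholds produces the decision curve.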

Baseline machine learning models without cross-validation

For the baseline comparison, four machine-learning models were configured with largely default parameters: logistic regression (maximum number of iterations n = 1000), a support vector classifier (SVC) with a linear kernel, k-nearest neighbors, and random forest. The data were split into training and test sets, and in a loop each model was fitted to the training set and then used to make predictions on the test data, from which its accuracy score was calculated. The console output indicated that the logistic regression and k-nearest neighbors models achieved the highest accuracy (81.9% each), while the SVM with a linear kernel and the random forest each scored 78.6%. These accuracy scores reflect the default settings of the models. The classification accuracy score was calculated as the proportion of samples whose predicted labels exactly matched the true labels, and it effectively highlighted the differences among the models on the given dataset. However, when evaluating macro-average metrics such as precision, recall, and F1-score, the random forest model excelled, achieving a true-match accuracy range of 93–97%. This indicates that while logistic regression is preferable when accuracy is the primary consideration, the random forest model is more suitable for scenarios where overall performance (including precision, recall, and F1-score) is crucial for decision-making.
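The distinction between plain accuracy and macro-averaged metrics can be illustrated with a small sketch. Macro-averaging computes the metric separately for each class and then takes the unweighted mean, so a minority class counts as much as the majority class. The function names and toy labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    """F1-score treating `cls` as the positive class (0.0 if undefined)."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Toy imbalanced example: a model that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(round(acc, 3), round(mf1, 3))
```

Here accuracy is 0.8 while the macro F1-score is only about 0.44, which is why the ranking of models can differ between the two kinds of metric.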

Machine learning models using cross-validation

Another approach for resampling radiological datasets in machine learning is cross-validation. Cross-validation repeatedly partitions the dataset into complementary training and validation subsets, fits the model on each training subset, and evaluates it on the corresponding held-out subset; the results across subsets are then combined. This process is effective for detecting overfitting and for assessing how well learned patterns generalize. Each model was individually fitted and evaluated, and its accuracy score was calculated using cross-validation and an accuracy function. Table 2S outlines the average accuracy achieved through 5-fold cross-validation. The random forest and k-nearest neighbors models achieved the highest accuracy of 84.15%. This is because these models can accommodate both datasets by applying decision-tree separation in a tree-like structure. Closely following, the logistic regression model reached an accuracy of 83.81%, while the support vector machine model achieved an accuracy of 82.49% on the normalized data samples. Notably, the random forest algorithm recorded a maximum accuracy of 90% in certain cases. Using the upgraded data samples, 5-fold cross-validation was carried out to evaluate and compare the machine-learning models, including logistic regression with a maximum of n = 1000 iterations, SVM, k-nearest neighbors, and random forest. For each model, the average accuracy was calculated as the mean of the five cross-validation scores and presented as a percentage with two decimal places. The generated table reports the accuracy of each fold and the average accuracy of the sample data.

Table 3S shows the accuracy scores of the combined machine-learning models under various parameters. The k-nearest neighbors model has the highest accuracy (84.15%), followed by the logistic regression and random forest models (83.83% each), while the support vector machine model has the lowest accuracy (82.2%). Researchers may use the highest or lowest scores to evaluate model accuracy; however, because these maximum and minimum values are obtained from the accuracies of the five cross-validation folds, this approach to model evaluation may cause confusion. The maximum and minimum accuracy scores returned for each model also depend on the settings of the normalization parameters. When sample reshuffling during cross-validation iterations is considered, shuffling is enabled for the stratified splits. In terms of correctly classifying samples, the differences between the models are significant.

From Fig. 3S, the best-tuned models tested with predicted values and inverse parameter tuning yielded the following accuracy scores: logistic regression (81%), random forest (80.8%), support vector machine (83.1%), and k-nearest neighbors (slightly lower). All models demonstrated accuracy scores exceeding 90%, indicating satisfactory classification performance when combined and tested.

The accuracy scores of each model are compared against the highest and lowest scores, including scenarios with and without cyclical accuracy considerations (Fig. 4S). The logistic regression and support vector machine (SVM) models exhibited substantial discrepancies between non-cross-validated and cross-validated results, with SVM showing the largest difference. Conversely, the k-nearest neighbors (k-NN) model yielded the most consistent and optimal results among the four models. Similarly, when a single independent model was compared with multiple models using validation-supported SVM, the accuracy differed by 14% from that of the k-NN model. The closest model accuracy was within a 5% lower margin (Fig. 5S). Due to standard resampling using n-fold during model optimization, the baseline accuracy of the machine learning models was lower than the cross-validated accuracy.

Discussion

In this radiomics analysis, feature extraction was performed using T1C, T2, and T2-FLAIR MRI sequences. Three distinct radiomics approaches were developed to predict MGMT methylation status: one based on intratumoral features, one on peritumoral features, and one on combined intra- and peritumoral features. The results showed that the Rad-scores constructed from the combined intra- and peritumoral features yielded the highest AUCs in both the training and testing cohorts.

In many studies of quantitative image feature assessment for glioblastoma, only one MRI sequence has been used [13,14]. This single-sequence strategy often results in a constrained dataset that may not fully capture the complexity and heterogeneity of glioblastoma tumors. Recognizing this limitation, we assessed the efficacy of a multi-sequence model, which integrates data from various MRI sequences, in differentiating between MGMT methylated and MGMT unmethylated glioblastomas. In this study, the multi-sequence models showed better performance in most cases. This enhanced performance can be attributed to the richer and more comprehensive set of features available when multiple sequences are combined, allowing for a more nuanced and accurate classification. Our findings are consistent with a series of prior studies that have highlighted the advantages of utilizing multi-sequence MRI data [15–17]. Among the peritumoral models, however, some single-sequence models performed better in some respects, possibly because of their specific imaging characteristics. This suggests that future studies should compare the performance of different models in various situations in more detail to determine which model is most suitable for a specific clinical task.

Recently, radiomics has gained recognition as an effective method for extensively quantifying tumor phenotypes through numerous quantitative imaging features. Many studies have concentrated on distinguishing between MGMT methylated and unmethylated glioblastomas prior to surgery using multimodal MRI images, with most research emphasizing features within intratumoral regions [18]. Nevertheless, several studies have shown that peritumoral region features also hold vital information [19,20], evidenced by changes in the area surrounding tumors, including peritumoral lymphatic vessel invasion, lymphocytic infiltration, and edema. Therefore, in this study, we extracted radiomic features from both intratumoral and peritumoral regions. The integrated model exhibited substantially better predictive performance than the single-region models. Hence, analyzing peritumoral radiomics might be useful for predicting MGMT status.

The radiomics features identified in this study can be roughly divided into three categories: (I) first-order statistical features; (II) second-order texture features; and (III) higher-order features based on the wavelet transform. These features provide objective, quantitative tumor-related information that is difficult to detect with the naked eye, and they usually reflect the pathophysiology of the complex microcirculation and microenvironment inherent in the tumor. The intratumoral features extracted in this study mainly included texture features and wavelet features, followed by statistical features. The extracted peritumoral features mainly included statistical features and wavelet features, followed by texture features.

In both the intratumoral and the peritumoral feature sets, higher-order features based on the wavelet transform are common, which may be because wavelet features decompose the image into high-frequency and low-frequency components and can therefore reflect more microscopic information about the tumor. The first-order statistical features reflect the symmetry, homogeneity, and intensity variation of voxels, and thus basic tumor characteristics such as signal intensity, whereas the second-order texture features reflect the spatial arrangement of voxel gray levels and reveal internal information such as tumor heterogeneity. The higher the value of a texture contrast feature, the greater the texture contrast of the image and the higher the heterogeneity of the tumor.
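As an illustration of how a second-order texture feature captures heterogeneity, the sketch below computes the contrast of a gray level co-occurrence matrix (GLCM) for a single pixel offset on a tiny 2D image. It is a simplified, hypothetical example: it assumes already discretized gray levels and one direction only, whereas radiomics toolkits typically work on 3D ROIs and aggregate over several directions.

```python
def glcm_contrast(image, dx=1, dy=0):
    """Contrast of a symmetric, normalized GLCM for one pixel offset.
    image: list of rows of integer gray levels; (dx, dy): neighbor offset."""
    counts = {}
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                # Count the pair in both orders so the matrix is symmetric.
                counts[(i, j)] = counts.get((i, j), 0) + 1
                counts[(j, i)] = counts.get((j, i), 0) + 1
    total = sum(counts.values())
    # Contrast = sum over gray-level pairs of p(i, j) * (i - j)^2.
    return sum(n * (i - j) ** 2 for (i, j), n in counts.items()) / total

homogeneous = [[1, 1, 1],
               [1, 1, 1]]
heterogeneous = [[1, 3, 1],
                 [3, 1, 3]]
c_low = glcm_contrast(homogeneous)
c_high = glcm_contrast(heterogeneous)
print(c_low, c_high)
```

The uniform patch has zero contrast, while the alternating patch, in which every horizontal neighbor pair differs by two gray levels, has a contrast of 4.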

The better predictive performance of the peritumoral model compared to the intratumoral model in this study may be related to the strong invasiveness of glioblastoma cells, resulting in significant heterogeneity of the peritumoral microenvironment compared to the tumor itself, which is also consistent with some previous studies [21]. There were no statistically significant associations between MGMT promoter methylation and clinical or imaging characteristics such as sex, age, and tumor volume, suggesting that the methylation status of the MGMT promoter is more closely related to microscopic features of the tumor and the peritumoral region. Furthermore, our radiomic model showed relatively good performance compared to some previous studies on MGMT status prediction. As summarized in a prior meta-analysis, the pooled AUC of the 15 studies was estimated to be 0.778, with more than half of them yielding an AUC lower than 0.800 [10]. This comparison underscores the incremental advancement achieved by our model. While the enhancement in AUC is not exceptionally large, it nonetheless signifies a step forward in predictive accuracy and robustness, suggesting that our approach of integrating multiple MRI sequences as well as intratumoral and peritumoral ROIs may offer a more efficient tool for the intended application.

There are also some limitations to this study that must be addressed. Firstly, its retrospective design may have introduced selection bias. A prospective study, if feasible, would provide more robust insights. Secondly, the study was conducted at a single center with a limited patient cohort, which hindered the application of advanced data analysis methods. Therefore, the findings need external validation in a larger patient population, which would also allow for the development of a deep learning model. Additionally, ROIs were manually delineated slice by slice, which is laborious and may introduce variability among radiologists. Future research could benefit from implementing automated glioblastoma segmentation methods [22].

In this study, the radiomic model of the region around the tumor performed better than the radiomic model inside the tumor, which may be closely related to the characteristics of the tumor microenvironment. The radiomic characteristics of the surrounding area of the tumor can more effectively reflect the dynamic changes of the tumor microenvironment, which are closely related to the metastasis and progression of the tumor. In addition, there may be cystic areas and frequent necrosis within the tumor, which can degrade the performance of the internal tumor model. The characteristics of the region surrounding the tumor may provide more specific insights into tumor properties, thereby improving the predictive power of the model.

This modeling approach may generalize to similar research in the future. With the deepening understanding of the role of the tumor microenvironment in cancer progression and metastasis, more and more studies have begun to focus on the radiomic characteristics of the region around the tumor. These studies have consistently shown that the radiomic characteristics of the peritumoral area are of great value in predicting the therapeutic response and prognosis of the tumor. Therefore, incorporating the radiomic characteristics of the region around the tumor into model construction can provide a more comprehensive and accurate prediction tool for future studies, and help realize precision medicine for cancer.

In conclusion, we developed and validated an MRI radiomics model based on machine learning to differentiate MGMT methylation status in glioblastoma patients. This model offers a non-invasive, stable, and relatively accurate method for predicting MGMT methylation, which has significant potential to assist in clinical decision-making for personalized treatment. Our study demonstrates that peritumoral radiomics features provide unique biological insights and enhance the model’s performance, suggesting that incorporating peritumoral radiomics could serve as a valuable adjunct to traditional biopsy-based methods. Future research should focus on validating these findings in larger cohorts and exploring how peritumoral radiomics can be optimally integrated into clinical workflows for maximum benefit.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (282.2KB, docx)

Acknowledgements

Not Applicable.

Abbreviations

MRI

magnetic resonance imaging

MGMT

O6-Methylguanine-DNA Methyltransferase

ROC

receiver operating characteristic

AUC

area under the curve

MS-PCR

methylation-specific polymerase chain reaction

CCRT

concurrent chemoradiotherapy

T2

T2-weighted imaging

T1C

T1-weighted contrast-enhanced

FLAIR

fluid attenuated inversion recovery

FOV

field of view

TR

Repetition Time

TE

Echo Time

MPRAGE

Magnetization prepared rapid gradient echo

AC-PC

anterior commissure-posterior commissure

ROI

region of interest

GLCM

gray level co-occurrence matrix

PME

peritumor microenvironment

IDH

Isocitrate Dehydrogenase

IRB

Institutional Review Board

Author contributions

Lan Hong and Li Su conceived the study. Qin-Lei Cai, Xue-Hua Wang, and Fei Wang participated in its design, data analysis, and statistics. Wang-Sheng Chen and Fang-Xiong Fu helped to draft the manuscript. All authors read and approved the final manuscript.

Funding

This work was supported by the National Natural Science Foundation of China [grant number 82360342]; the Joint Program on Health Science & Technology Innovation of Hainan Province [grant number WSJK2024MS227]; the Alzheimer’s Research UK Senior Research Fellowship [grant number ARUK-SRF2017B-1]; the National Institute for Health and Care Research (NIHR) Sheffield Biomedical Research Centre [grant number NIHR203321]; and the Academic Enhancement Support Program of Hainan Medical University [grant number XSTS2025010].

Data availability

All data generated or analysed during this study are included in this article. Further enquiries can be directed to the corresponding author.

Declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Board (IRB) of Hainan General Hospital. Informed consent was waived by the IRB because of the retrospective design of the study and the anonymized clinical data used in the analysis. The study was performed in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Wang-Sheng Chen, Fang-Xiong Fu, Qin-Lei Cai

Contributor Information

Lan Hong, Email: honglan625402542@163.com.

Li Su, Email: l.su@sheffield.ac.uk.

References

  • 1. Preusser, M. et al. Prospects of immune checkpoint modulators in the treatment of glioblastoma. Nat. Rev. Neurol. 11 (9), 504–514 (2015).
  • 2. Ou, A., Yung, W. K. A. & Majd, N. Molecular mechanisms of treatment resistance in glioblastoma. Int. J. Mol. Sci. 22 (1) (2020).
  • 3. Balça-Silva, J. et al. Cellular and molecular mechanisms of glioblastoma malignancy: Implications in resistance and therapeutic strategies. Semin. Cancer Biol. 58, 130–141 (2019).
  • 4. Brandes, A. A. et al. MGMT promoter methylation status can predict the incidence and outcome of pseudoprogression after concomitant radiochemotherapy in newly diagnosed glioblastoma patients. J. Clin. Oncol. 26 (13), 2192–2197 (2008).
  • 5. Gilbert, M. R. et al. Dose-dense temozolomide for newly diagnosed glioblastoma: A randomized phase III clinical trial. J. Clin. Oncol. 31 (32), 4085–4091 (2013).
  • 6. Mansouri, A. et al. MGMT promoter methylation status testing to guide therapy for glioblastoma: Refining the approach based on emerging evidence and current challenges. Neuro Oncol. 21 (2), 167–178 (2019).
  • 7. Cankovic, M. et al. A simplified laboratory validated assay for MGMT promoter hypermethylation analysis of glioma specimens from formalin-fixed paraffin-embedded tissue. Lab. Invest. 87 (4), 392–397 (2007).
  • 8. Wick, W. et al. MGMT testing–the challenges for biomarker-based glioma treatment. Nat. Rev. Neurol. 10 (7), 372–385 (2014).
  • 9. Lambin, P. et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 14 (12), 749–762 (2017).
  • 10. Doniselli, F. M. et al. Quality assessment of the MRI-radiomics studies for MGMT promoter methylation prediction in glioma: a systematic review and meta-analysis. Eur. Radiol. (2024).
  • 11. Pease, M. et al. Pre-operative MRI radiomics model non-invasively predicts key genomic markers and survival in glioblastoma patients. J. Neurooncol. 160 (1), 253–263. 10.1007/s11060-022-04150-0 (2022).
  • 12. Do, D. T., Yang, M. R., Lam, L. H. T., Le, N. Q. K. & Wu, Y. W. Improving MGMT methylation status prediction of glioblastoma through optimizing radiomics features using genetic algorithm-based machine learning approach. Sci. Rep. 12 (1), 13412 (2022).
  • 13. Lu, Y. et al. Machine learning-based radiomic, clinical and semantic feature analysis for predicting overall survival and MGMT promoter methylation status in patients with glioblastoma. Magn. Reson. Imaging 74, 161–170 (2020).
  • 14. Vils, A. et al. Radiomic analysis to predict outcome in recurrent glioblastoma based on Multi-Center MR imaging from the prospective DIRECTOR trial. Front. Oncol. 11, 636672 (2021).
  • 15. Feng, L. et al. Development and validation of a radiopathomics model to predict pathological complete response to neoadjuvant chemoradiotherapy in locally advanced rectal cancer: a multicentre observational study. Lancet Digit. Health 4 (1), e8–e17 (2022).
  • 16. Bathla, G. et al. Radiomics-based differentiation between glioblastoma and primary central nervous system lymphoma: a comparison of diagnostic performance across different MRI sequences and machine learning techniques. Eur. Radiol. 31 (11), 8703–8713 (2021).
  • 17. Wang, H. et al. Elaboration of a multisequence MRI-based radiomics signature for the preoperative prediction of the muscle-invasive status of bladder cancer: A double-center study. Eur. Radiol. 30 (9), 4816–4827 (2020).
  • 18. Xi, Y. B. et al. Radiomics signature: A potential biomarker for the prediction of MGMT promoter methylation in glioblastoma. J. Magn. Reson. Imaging 47 (5), 1380–1387 (2018).
  • 19. Cheng, J. et al. Prediction of glioma grade using intratumoral and peritumoral radiomic features from multiparametric MRI images. IEEE/ACM Trans. Comput. Biol. Bioinform. 19 (2), 1084–1095 (2022).
  • 20. Tan, R. et al. MRI-based intratumoral and peritumoral radiomics for preoperative prediction of glioma grade: A multicenter study. Front. Oncol. 14, 1401977 (2024).
  • 21. Yuan, Y. et al. Application value of MRI peritumoral radiomics features in differentiation between high-grade glioblastoma and single brain metastasis. J. Mod. Oncol. 31 (6), 1103–1107 (2023).
  • 22. Cheng, J. et al. A fully automated multimodal MRI-Based Multi-Task learning for glioma segmentation and IDH genotyping. IEEE Trans. Med. Imaging 41 (6), 1520–1532 (2022).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
