BMC Medical Informatics and Decision Making. 2024 Oct 31;24:320. doi: 10.1186/s12911-024-02722-w

Prediction of femoral head collapse in osteonecrosis using deep learning segmentation and radiomics texture analysis of MRI

Shihua Gao 1, Haoran Zhu 2, Moshan Wen 2, Wei He 3,4, Yufeng Wu 1, Ziqi Li 3,4, Jiewei Peng 1
PMCID: PMC11526660  PMID: 39482688

Abstract

Background

Femoral head collapse is a critical pathological change and is regarded as the turning point of disease progression in osteonecrosis of the femoral head (ONFH). In this study, we aimed to build an automatic femoral head collapse prediction pipeline for ONFH based on magnetic resonance imaging (MRI) radiomics.

Methods

In the segmentation model development dataset, T1-weighted MRI scans of 222 hips from two hospitals were retrospectively collected and randomly split into training (n = 190) and test (n = 32) sets. In the prognosis prediction model development dataset, 206 hips were also retrospectively collected from two hospitals and divided into a training set (n = 155) and an external test set (n = 51) according to data source. A deep learning model for automatic lesion segmentation was trained with nnU-Net, from which three-dimensional regions of interest were segmented and a total of 107 radiomics features were extracted. After intra-class correlation coefficient screening, feature correlation coefficient screening and Least Absolute Shrinkage and Selection Operator regression feature selection, machine learning models for ONFH prognosis prediction were trained with the Logistic Regression (LR) and Light Gradient Boosting Machine (LightGBM) algorithms.

Results

The segmentation model achieved an average dice similarity coefficient of 0.848 and an average 95% Hausdorff distance of 3.794 in the test set, compared to the manual segmentation results. After feature selection, nine radiomics features were included in the prognosis prediction model. External test showed that the LightGBM model exhibited acceptable predictive performance. The area under the curve (AUC) of the prediction model was 0.851 (95% CI: 0.7268–0.9752), with an accuracy of 0.765, sensitivity of 0.833, and specificity of 0.727. Decision curve analysis showed that the LightGBM model exhibited favorable clinical utility.

Conclusion

This study presents an automated pipeline for predicting femoral head collapse in ONFH with acceptable performance. Further research is necessary to determine the clinical applicability of this radiomics-based approach and to assess its potential to assist in treatment decision-making for ONFH.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12911-024-02722-w.

Keywords: Osteonecrosis of the femoral head, MRI, Deep learning, Machine learning, Radiomics

Introduction

Osteonecrosis of the femoral head (ONFH) predominantly affects young and middle-aged adults. ONFH may progress into collapse of the femoral head, leading to rapid joint destruction, resulting in severe hip pain and dysfunction [1]. Femoral head collapse has been generally viewed as the turning point of disease progression in ONFH. Assessing the risk of collapse progression in ONFH is crucial, as patients at lower risk may benefit from conservative treatment, whereas those at higher risk may require hip preservation surgery [2, 3].

Multiple methods have been developed to evaluate the risk of femoral head collapse using radiological examinations [4, 5]. However, it should be noted that the current evaluation methods primarily depend on manual analysis, which may introduce risk of inconsistency between observers. Furthermore, the performance of existing methods in predicting femoral head collapse requires further improvement.

Radiomics is an image analysis method that quantitatively extracts deep information in medical images. Combined with machine learning, radiomics analysis could provide more comprehensive insights from medical imaging data [6–8]. A substantial body of oncology research employs radiomics to explore tumor imaging heterogeneity and develop models that enhance clinical diagnosis and treatment [9–12]. Therefore, it is feasible to employ radiomics analysis to investigate the heterogeneity within ONFH lesions and explore the possibility of clinical applications in the context of ONFH.

Magnetic resonance imaging (MRI) is a reliable examination for the diagnosis of ONFH [13]. Compared with radiography or computed tomography, MRI offers a distinct advantage in diagnosing and assessing early-stage ONFH, as it can provide clearer visualization of necrotic lesions, facilitating the delineation of regions of interest (ROI) for radiomics feature extraction. However, the process of manual delineation of ROIs on three-dimensional medical images is often time-consuming and labor-intensive. Furthermore, the inherent variability in delineation results among different observers may lead to inconsistencies in radiomics features. In order to facilitate clinical utility and ensure reproducibility, it is imperative to implement automatic segmentation of ROIs.

Recent studies have underscored the effectiveness of deep learning and radiomics in diagnosing avascular necrosis of the femoral head [14–19]. For example, Klontzas et al. highlighted the utility of a convolutional neural network ensemble model for differential diagnosis in hip pathologies, further confirming the role of deep learning in enhancing the diagnostic precision of ONFH [16]. Collectively, these studies highlight the significant potential of deep learning and radiomics in the diagnosis of ONFH [14–19]. However, most previous studies have focused on the diagnosis of ONFH, with less emphasis on exploring the prognostic value of deep learning and radiomics for predicting outcomes such as femoral head collapse.

Therefore, our study aims to address this gap by leveraging deep learning and radiomics to predict the risk of femoral head collapse. By integrating a deep learning segmentation model with radiomics analysis, we aim to develop a pipeline for automatic and objective risk assessment of femoral head collapse in ONFH.

Methods

Patients and datasets

Patients with ONFH were retrospectively collected from two centers (Center 1: The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Center 2: The Third Affiliated Hospital of Guangzhou University of Chinese Medicine) between 2017 and 2020. The inclusion criteria were (a) diagnosis of non-traumatic ONFH with reference to the Association Research Circulation Osseous (ARCO) consensus [13], (b) pre-collapse ONFH, including ARCO stage I and stage II, and (c) completion of an MRI scan at the initial visit. The exclusion criteria were (a) inaccessible MRI data, (b) history of trauma or surgery of the affected hip, and (c) age under 18 years. Ethical approval was obtained, and informed consent was waived due to the retrospective nature of the study design.

A total of 718 hips diagnosed with ONFH were screened across two hospitals (421 from Center 1, 297 from Center 2). According to the inclusion and exclusion criteria, 428 hips with ONFH were finally enrolled. Among these hips, 222 hips with incomplete follow-up data were allocated to the deep learning segmentation model development dataset (random split: 190 in training set, 32 in test set), while 206 hips with complete follow-up were assigned to the radiomics prognosis prediction model development dataset (155 from Center 1 as training set, 51 from Center 2 as test set). The overview of datasets generation is depicted in Fig. 1.

Fig. 1.

Study flow chart of the enrolled patients

Imaging and annotation

MRI data of both datasets were acquired using three MRI scanners: Siemens Prisma 3.0 T, GE Signa HDxt 3.0 T and GE Signa HDxt 1.5 T. Coronal T1-weighted turbo spin echo (TSE) images were obtained for model training and testing, and the parameters were outlined as follows: TR/TE: 350–960/7–17 ms; slice thickness: 3–5 mm (Supplementary Table S1). Coronal T1-weighted images were cropped as left and right hips. Z-score normalization and resampling were performed on the images using SimpleITK (version 2.1.0). The resampled pixel spacing was set to the median of all image data, which was 0.74 mm × 0.74 mm, and no resampling operation was performed on the sagittal axis.
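The preprocessing described above can be illustrated with a small pure-Python sketch of Z-score normalization and of choosing the median in-plane spacing as the resampling target. It does not call SimpleITK. The function names, the use of the population standard deviation, and the toy spacings are assumptions made for illustration rather than details reported by the study.

```python
from statistics import mean, median, pstdev


def zscore(values):
    """Rescale intensities to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant image carries no contrast; map every voxel to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def median_spacing(spacings):
    """Median in-plane spacing (x, y) over a list of (x, y) tuples in mm."""
    return (median([s[0] for s in spacings]), median([s[1] for s in spacings]))


def resampled_size(size, spacing, target):
    """In-plane matrix size after resampling from `spacing` to `target` spacing."""
    return tuple(max(1, round(n * s / t)) for n, s, t in zip(size, spacing, target))


# Hypothetical in-plane spacings (mm) of three scans.
target = median_spacing([(0.70, 0.70), (0.74, 0.74), (0.80, 0.80)])
print(target)                                            # (0.74, 0.74)
print(resampled_size((256, 256), (0.80, 0.80), target))  # (277, 277)
```

Only the two in-plane axes are resampled in this sketch, mirroring the statement that one axis was left at its original spacing.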

In the segmentation model development dataset, three-dimensional ROIs capturing osteonecrosis lesion contours throughout the femoral head were segmented on coronal T1-weighted TSE images by a senior surgeon (ZQ.L.) who was blinded to the clinical outcomes, with the aid of T2-weighted MR images according to the ARCO consensus [20]. Manual segmentation was performed using 3D Slicer (version 5.0.3) [21]. The manually segmented ROIs served as the ground truth for model training and testing.

The Japanese Investigation Committee (JIC) classification of included hips was determined on the mid-coronal slice of T1-weighted TSE images [22]. Patients included in the prediction model development dataset all underwent conservative treatment and were followed at three-month intervals using anteroposterior (AP) and frog-leg lateral (FL) pelvic radiographs until either femoral head collapse occurred or a follow-up period of two years without evidence of collapse was completed. Femoral head collapse was assessed with the concentric circles method on both AP and FL radiographs using ImageJ software (version 1.52a). Femoral head collapse was defined as an increase in the amount of collapse by more than 2 mm during the two-year follow-up (i.e. from ARCO stage I or II to stage IIIA or IIIB) in either the AP or the FL view [23, 24].
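The outcome definition above amounts to a simple labeling rule, sketched below. The function name `is_collapse` and its arguments are hypothetical; the only values taken from the text are the 2 mm threshold and the "either view" condition.

```python
def is_collapse(ap_increase_mm, fl_increase_mm, threshold_mm=2.0):
    """Label a hip as collapsed if the measured increase in collapse exceeds
    the threshold in either the anteroposterior (AP) or frog-leg (FL) view."""
    return ap_increase_mm > threshold_mm or fl_increase_mm > threshold_mm


print(is_collapse(0.5, 2.4))  # True: the FL view alone exceeds 2 mm
print(is_collapse(2.0, 1.0))  # False: exactly 2 mm is not "more than 2 mm"
```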

Segmentation model training and evaluation

The osteonecrosis lesion segmentation model was trained using the nnU-Net algorithm. The 3D full resolution U-Net network was utilized. A five-fold cross-validation strategy was employed during model training, and each fold was trained for 100 epochs. The optimizer was stochastic gradient descent, with an initial momentum of 0.99, a decay value of 3e-5, and an initial learning rate of 0.01. The learning rate followed a polynomial decay curve. The dice similarity coefficient (DSC), 95% Hausdorff distance (95% HD) and average surface distance (ASD) were used to assess the segmentation performance of the model. To enhance robustness of the model, after completing the five-fold cross-validation on the training set, the five best-performing networks were chosen within each fold based on DSC, and then ensembled to create the final model for evaluation on the test set. The training and evaluation of the segmentation model were conducted on PyTorch (version 1.11.0) with CUDA (version 11.3).
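Two of the quantities mentioned above, the DSC and the polynomial learning-rate decay, can be written out in a few lines. This is a minimal sketch and not the nnU-Net implementation: masks are flattened to 0/1 lists, and the decay exponent of 0.9 is an assumption (the text only says the decay was polynomial).

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention, assumed).
        return 1.0
    return 2.0 * intersection / total


def poly_lr(epoch, max_epochs=100, initial_lr=0.01, exponent=0.9):
    """Polynomially decayed learning rate; `exponent` is an assumed value."""
    return initial_lr * (1.0 - epoch / max_epochs) ** exponent


print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
print(poly_lr(0), poly_lr(100))                      # 0.01 0.0
```

With these settings the learning rate starts at 0.01 and reaches zero at epoch 100, which is the general shape a polynomial schedule produces regardless of the exact exponent.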

Feature extraction and selection

Radiomics feature extraction was performed on the prediction model development dataset using PyRadiomics (version 3.0.1). Three-dimensional ROIs of the dataset were segmented by the previously trained nnU-Net ensemble model. A total of 107 features were extracted, including 18 first-order statistical features, 14 shape features, 24 features derived from the gray level co-occurrence matrix (GLCM), 14 features derived from the gray level dependence matrix (GLDM), 16 features derived from the gray level run length matrix (GLRLM), 16 features derived from the gray level size zone matrix (GLSZM), and 5 features derived from the neighboring gray tone difference matrix (NGTDM).

Radiomics features of the test set in the segmentation model development dataset were extracted using both automatically and manually segmented ROIs. The intra-class correlation coefficients (ICC) of the radiomics features were calculated to assess the consistency of the extracted features between these two extraction methods, and only features with ICC > 0.75 were recognized as stable and used for further analysis.
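The stability screening can be sketched as follows. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here; the feature values are made up for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject (hip) and one column
    per measurement method (e.g. automatic and manual segmentation)."""
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of methods
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of one feature: (automatic ROI, manual ROI) per hip.
feature = [[10.0, 10.5], [20.0, 19.0], [30.0, 31.0], [40.0, 39.5]]
is_stable = icc_2_1(feature) > 0.75  # threshold taken from the text
print(is_stable)
```

A feature would be kept only if `is_stable` is true; repeating the check per feature gives the stable subset.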

Further feature selection was conducted on the training set of prediction model development dataset. After normalization of the extracted radiomics features, Spearman correlation comparison was conducted. Features with a correlation coefficient greater than 0.9 were identified as highly correlated, and one of them was subsequently excluded. The Least Absolute Shrinkage and Selection Operator (LASSO) algorithm with ten-fold cross-validation was applied for further selection of radiomics features. Feature selection was performed using Scikit-learn (version 1.0.2).
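The redundancy filter described above can be sketched in pure Python. Two choices below are assumptions, since the text does not specify them: the absolute value of the Spearman coefficient is compared with 0.9, and within a highly correlated pair the feature encountered first is the one kept. The LASSO step is not shown.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature has no defined correlation; treated as 0
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_correlated(features, threshold=0.9):
    """Greedily keep features whose |rho| with every kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns: "b" is a monotonic function of "a".
toy = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [5, 1, 4, 2, 3]}
print(drop_correlated(toy))  # ['a', 'c']
```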

Prediction model training and evaluation

Based on the selected radiomics features, binary classification models were developed to predict the occurrence of femoral head collapse with Logistic Regression (LR), Light Gradient Boosting Machine (LightGBM), Random Forest (RF), Support Vector Machine (SVM), K Nearest Neighbors (KNN) and eXtreme Gradient Boosting (XGBoost).

To mitigate the impact of sample imbalance, oversampling with the adaptive synthetic sampling (ADASYN) method was conducted in the training set. Grid search with five-fold cross-validation was employed to decide the hyper-parameters of the models (LR: penalty = ‘l2’, solver = ‘lbfgs’; LightGBM: learning_rate = 0.1, n_estimators = 10, max_depth = 3, objective = ‘binary’; RF: n_estimators = 10, max_depth = 3, min_samples_split = 2; SVM: probability = True; KNN: algorithm = ‘kd_tree’; XGBoost: n_estimators = 10, objective = ‘binary:logistic’, max_depth = 3, use_label_encoder = False, eval_metric = ‘error’).

Evaluation metrics including accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV) were employed. The predictive performance was evaluated using receiver operating characteristic (ROC) curve analysis and the area under the curve (AUC). Additionally, decision curve analysis was conducted to assess the clinical utility of the models. The performance of the prediction model was validated on the external test set. As LightGBM is a tree-based algorithm, the importance of the included radiomics features was analyzed through the frequency the feature was used and the gain of using the feature. The modeling and evaluation processes were also conducted using Scikit-learn (version 1.0.2).

The overview of the workflow is shown in Fig. 2. This study adheres to the CheckList for EvaluAtion of Radiomics research (CLEAR) guideline [25], and the overall quality of the pipeline was assessed by the METhodological RadiomICs Score (METRICS) tool [26], with a METRICS score of 88.8%.

Fig. 2.

Workflow of the study

Statistical analysis

The characteristics of the included hips were compared between the training and test sets. Continuous data were presented as means and standard deviations, while categorical data were presented as frequencies and percentages. After normality was confirmed with the Shapiro-Wilk test and homogeneity of variances with Levene’s test, comparisons of continuous data were conducted using Student’s t-test. Comparisons of categorical variables were performed using the chi-square test. All P-values were two-sided, and the significance level was set at 0.05. The statistical analysis was conducted with R software (version 3.0.1).
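For 2 × 2 comparisons such as gender or collapse status between training and test sets, the chi-square test can be sketched without any statistics library. The Yates continuity correction used below is an assumption (the text only names the chi-square test); the counts are those listed for the prediction model dataset in Table 1.

```python
import math


def chi_square_2x2(table, yates=True):
    """Chi-square test of independence for a 2 x 2 table [[a, b], [c, d]].
    Returns (statistic, two-sided P) with 1 degree of freedom."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(0.0, diff - 0.5)  # continuity correction (assumed)
            chi2 += diff ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: male, female; columns: training set, external test set.
_, p_gender = chi_square_2x2([[101, 37], [54, 14]])
# Rows: collapse absent, present; columns: training set, external test set.
_, p_collapse = chi_square_2x2([[101, 33], [54, 18]])
print(round(p_gender, 2), round(p_collapse, 2))
```

Under this assumption the two P-values come out close to the 0.423 and 1.000 listed in Table 1, which suggests, but does not prove, that a continuity correction was applied.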

Results

Patient characteristics

A total of 428 hips were enrolled in the study. No significant differences in clinical characteristics were found between the training set (n = 190) and the test set (n = 32) of the segmentation model dataset. Similarly, no significant differences were observed in the characteristics between the training set (n = 155) and the external test set (n = 51) of the prognosis prediction model dataset. In the training set, 34.8% of the affected hips progressed to femoral head collapse, while 35.3% of the hips in the test set experienced collapse of the femoral head. Details of the included hips are listed in Table 1.

Table 1.

Characteristics of the study dataset

| Characteristic | Segmentation model dataset: training set | Segmentation model dataset: test set | P | Prediction model dataset: training set | Prediction model dataset: test set | P |
|---|---|---|---|---|---|---|
| N | 190 | 32 | | 155 | 51 | |
| Affected side | | | 1.000 | | | 0.469 |
| Left | 104 (54.7%) | 18 (56.3%) | | 84 (54.2%) | 24 (47.1%) | |
| Right | 86 (45.3%) | 14 (43.8%) | | 71 (45.8%) | 27 (52.9%) | |
| Age (years) * | 38.15 ± 10.17 | 38.25 ± 9.82 | 0.958 | 40.06 ± 9.21 | 37.20 ± 12.10 | 0.077 |
| Gender | | | 0.558 | | | 0.423 |
| Male | 123 (64.7%) | 23 (71.9%) | | 101 (65.2%) | 37 (72.5%) | |
| Female | 67 (35.3%) | 9 (28.1%) | | 54 (34.8%) | 14 (27.5%) | |
| JIC classification | | | 0.616 | | | 0.111 |
| Type A | 13 (6.8%) | 2 (6.2%) | | 6 (3.9%) | 0 (0.0%) | |
| Type B | 17 (8.9%) | 5 (15.6%) | | 33 (21.3%) | 6 (11.8%) | |
| Type C1 | 91 (47.9%) | 16 (50.0%) | | 61 (39.4%) | 28 (54.9%) | |
| Type C2 | 69 (36.3%) | 9 (28.1%) | | 55 (35.5%) | 17 (33.3%) | |
| ARCO stage | | | 1.000 | | | 0.787 |
| Stage I | 11 (5.8%) | 2 (6.3%) | | 6 (3.9%) | 3 (5.9%) | |
| Stage II | 179 (94.2%) | 30 (93.8%) | | 149 (96.1%) | 48 (94.1%) | |
| Etiology | | | 0.911 | | | 0.15 |
| Steroidal | 103 (54.2%) | 18 (56.2%) | | 91 (58.7%) | 23 (45.1%) | |
| Alcoholic | 72 (37.9%) | 11 (34.4%) | | 47 (30.3%) | 23 (45.1%) | |
| Idiopathic | 15 (7.9%) | 3 (9.4%) | | 17 (11.0%) | 5 (9.8%) | |
| Collapse | - | - | - | | | 1.000 |
| Absence | - | - | | 101 (65.2%) | 33 (64.7%) | |
| Presence | - | - | | 54 (34.8%) | 18 (35.3%) | |

JIC: Japanese Investigation Committee, ARCO: Association Research Circulation Osseous; * means normal distribution

Performance of lesion segmentation

Table 2 shows the segmentation performance of nnU-Net on training set and test set. In the five-fold cross-validation, the average DSC ranged from 0.835 to 0.894, and the average 95% HD ranged from 2.486 to 4.026. On the test set, the ensemble model demonstrated an average DSC of 0.848, an average 95% HD of 3.794, and an average ASD of 0.554, indicating that the model achieved promising performance in lesion segmentation of ONFH (Fig. 3).

Table 2.

Performance of the nnU-Net segmentation model

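| Metric | Training set CV: fold 0 | Training set CV: fold 1 | Training set CV: fold 2 | Training set CV: fold 3 | Training set CV: fold 4 | Test set: ensemble |
|---|---|---|---|---|---|---|
| DSC | 0.894 | 0.836 | 0.845 | 0.848 | 0.835 | 0.848 |
| 95% HD | 2.486 | 3.574 | 3.942 | 3.514 | 4.026 | 3.794 |
| ASD | 0.317 | 0.488 | 0.587 | 0.492 | 0.613 | 0.554 |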

DSC: dice similarity coefficient, 95% HD: 95% Hausdorff distance, ASD: average surface distance

Fig. 3.

Schematic diagram of lesion segmentation. A-C: MRI slices from a random patient in the test set; D-F: nnU-Net model segmentation result; G-I: manual segmentation result

Feature extraction and selection

A total of 107 radiomics features were extracted (Figure S1), among which 79 features were regarded as stable with ICC > 0.75. Subsequently, Spearman correlation coefficients were calculated to identify and remove redundant features with correlation coefficients higher than 0.9, resulting in the exclusion of a total of 25 features (Figure S2). Finally, LASSO regression was performed for further feature selection. After removal of 45 features (Figure S3), the remaining nine radiomics features for prediction model construction were as follows: first-order kurtosis, GLRLM run length non-uniformity, GLSZM gray level variance, GLSZM large area low gray level emphasis, GLSZM small area high gray level emphasis, NGTDM busyness, shape elongation, shape major axis length and shape voxel volume.

Performance of collapse prediction

Six algorithms were utilized to construct prediction models for femoral head collapse based on the selected nine radiomics features. The RF and XGBoost models showed significant overfitting on the training set. The LightGBM model demonstrated the highest AUC on both the training set and the external test set compared to the LR, SVM and KNN models (Fig. 4).

Fig. 4.

Receiver operating characteristic curves of prediction models. Performance of the machine learning models on the training set (A) and the external test set (B). AUC: area under the curve

The LightGBM algorithm showed superior performance compared to the simpler LR algorithm. ROC analysis of the prediction model showed the AUC of LR classifier was 0.857 (95% CI: 0.793–0.920) in the training set and 0.816 (95% CI: 0.679–0.953) in the test set, while the AUC of LightGBM classifier was 0.924 (95% CI: 0.890–0.959) in the training set and 0.851 (95% CI: 0.727–0.975) in the test set. Table 3 presents the ACC, SEN, SPE, PPV, NPV and F1-scores of the prediction model when the threshold was set at 0.5.

Table 3.

Performance of the femoral head collapse prediction model

| Metric | LR: training set | LR: test set | LightGBM: training set | LightGBM: test set |
|---|---|---|---|---|
| ACC | 0.811 | 0.753 | 0.831 | 0.765 |
| SEN | 0.820 | 0.811 | 0.819 | 0.833 |
| SPE | 0.798 | 0.756 | 0.842 | 0.727 |
| PPV | 0.783 | 0.710 | 0.828 | 0.625 |
| NPV | 0.809 | 0.783 | 0.833 | 0.889 |
| AUC (95% CI) | 0.857 (0.793–0.920) | 0.816 (0.679–0.953) | 0.924 (0.890–0.959) | 0.851 (0.727–0.975) |

LR: Logistic Regression, LightGBM: Light Gradient Boosting Machine, ACC: accuracy, SEN: sensitivity, SPE: specificity, PPV: positive predictive value, NPV: negative predictive value
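The threshold-based metrics in Table 3 all derive from a 2 × 2 confusion matrix, as the sketch below shows. The counts are not reported in the text; they are back-calculated here as an assumption from the LightGBM external test set figures (51 hips, 18 with collapse) and are used purely to illustrate the definitions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "SEN": tp / (tp + fn),   # collapsed hips correctly flagged
        "SPE": tn / (tn + fp),   # non-collapsed hips correctly cleared
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Assumed counts: 15 of 18 collapsed hips and 24 of 33 non-collapsed hips
# classified correctly at a threshold of 0.5.
metrics = classification_metrics(tp=15, fp=9, tn=24, fn=3)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With these assumed counts the rounded values are 0.765, 0.833, 0.727, 0.625 and 0.889, matching the LightGBM test set column.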

Decision curve analysis was conducted to identify the range of threshold probabilities over which acting on the prediction model could benefit patients. Figure 5 demonstrates that the clinical benefit of the LightGBM model exceeded zero when the threshold was set between 0.22 and 0.76 in the training set, and the interval was similar in the test set, ranging from 0.16 to 0.77.
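Decision curve analysis plots net benefit against the threshold probability. A minimal sketch of the standard net benefit formula is given below; the confusion counts are an assumption back-calculated from the reported LightGBM test set metrics at a threshold of 0.5, so only that single point of the curve can be illustrated.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model: true positives minus false positives weighted
    by the odds of the threshold probability, per patient."""
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


n = 51                                              # external test set size
model_nb = net_benefit(tp=15, fp=9, n=n, threshold=0.5)   # assumed counts
treat_all_nb = net_benefit_treat_all(18 / n, 0.5)         # 18 collapsed hips
print(round(model_nb, 3), round(treat_all_nb, 3))
```

A model is considered useful at a given threshold when its net benefit exceeds both zero (treat none) and the treat-all strategy, which is the comparison the decision curves summarize across thresholds.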

Fig. 5.

Decision curve analysis of the LightGBM model. Decision curve analysis showed the LightGBM model could bring clinical benefit when the threshold was set at 0.22 to 0.76 in the training set (A), and at 0.16 to 0.77 in the test set (B)

Feature importance analysis of the LightGBM model revealed that GLSZM gray level variance was of highest importance, followed by the shape features, including shape elongation, shape voxel volume, and shape major axis length. Next, NGTDM busyness and GLSZM small area high gray level emphasis were identified as being of equal importance. The least contributing features in the model were GLRLM run length non-uniformity, GLSZM large area low gray level emphasis, and first-order kurtosis (Fig. 6).

Fig. 6.

Feature importance of the LightGBM model

Discussion

In this study, we utilized the nnU-Net deep learning algorithm to develop an automatic segmentation model for ROI delineation on T1-weighted MRI images. The model demonstrated a high degree of accuracy in segmenting necrotic lesions. We further employed the segmentation model to automatically obtain ROIs of the prediction model dataset and extract radiomics features. After conducting feature engineering, nine radiomics features were selected. Machine learning based classifiers for collapse prediction were then trained. The results showed that the LightGBM model achieved the best performance among these classifiers, with an AUC of 0.924 (95% CI: 0.890–0.959) in the training set and 0.851 (95% CI: 0.727–0.975) in the external test set. The findings of this study indicate that the automated analysis pipeline not only ensures reproducibility but also delivers robust predictions for femoral head collapse, thereby facilitating informed decision-making in the management of ONFH.

Currently, a considerable amount of research concentrates on employing deep learning techniques for medical image analysis [27–29]. Deep learning has emerged as a powerful tool in the diagnosis of ONFH as well, with studies reporting accuracy rates of up to 97.62% in distinguishing ONFH from transient osteoporosis using MRI data [27–29]. In 2015, Ronneberger et al. introduced the U-Net convolutional neural network architecture, which exhibited exceptional performance in various medical image segmentation tasks [30]. Based on 2D and 3D U-Net architectures, Isensee et al. developed an improved adaptive framework named nnU-Net, which demonstrated top performance in the Medical Segmentation Decathlon and confirmed the efficacy of nnU-Net for a variety of medical image segmentation tasks [31, 32]. Adrian et al. demonstrated the capability of nnU-Net to automatically segment and quantify necrotic lesions, offering a more objective assessment than traditional 2D measurements like the Kerboul angle. However, due to a relatively small sample size of 30 hips, a 5-fold cross-validation was performed, with a DSC of 0.75. In this study, we further verified the suitability of nnU-Net for segmenting necrotic lesions of ONFH on MRI using a larger dataset, with a DSC of 0.848 on the test set, suggesting possibly superior segmentation performance.

In this study, radiomic features were extracted from both manual and automatic segmentation masks, and only radiomic features with good consistency were selected for subsequent model construction to ensure the reliability of the selected features. After further feature selection, a total of nine radiomics features were chosen for prediction model construction. These included three shape features—shape elongation, shape major axis length, and shape voxel volume; five texture features—GLRLM run length non-uniformity, GLSZM gray level variance, GLSZM large area low gray level emphasis, GLSZM small area high gray level emphasis, and NGTDM busyness; and one first-order feature—first-order kurtosis. These radiomics features collectively describe the heterogeneity of the lesion of ONFH, and could serve as predictors for femoral head collapse.

It is widely recognized that the occurrence of femoral head collapse is closely related to the extent of the necrotic lesion [33]. The Steinberg classification system assesses the size of the necrotic lesion by visually estimating the percentage it occupies within the femoral head (< 15%, 15–30%, > 30%) [34]. The Kerboul classification categorizes necrosis by measuring the angles affected by the necrotic lesion on X-rays or MRI [35, 36]. Additionally, Wu et al. discovered that the geometric shape of the necrotic lesions on MRI has predictive value for femoral head collapse [37]. Consistent with prior research, feature importance analysis reveals that the three shape features, including shape elongation, shape major axis length, and shape voxel volume, all significantly contribute to the prediction of collapse, providing a comprehensive quantitative description of the regional morphology and capturing the geometric complexity of the necrotic lesion.

Texture features in radiomics are statistical measurements of matrices after transformation of the original images [38]. According to the feature importance analysis of the LightGBM model, GLSZM gray level variance makes the most significant contribution to the prediction of femoral head collapse. Our findings align with the observations made by Shimizu et al. [39], who categorized the texture of necrotic lesions into high, mixed, and low intensity types. They noted the mixed intensity type may be strongly correlated with femoral head collapse. The GLSZM gray level variance is a radiomics feature that quantifies the distribution of pixel intensities, reflecting the heterogeneity of the texture within the necrotic lesion. This feature likely contributes significantly to the prediction of femoral head collapse due to its sensitivity to tissue density variations within the necrotic lesion. Its high ranking in feature importance analysis by the LightGBM model corroborates its clinical relevance and aligns with observations that texture intensity types are correlated with the risk of collapse. In addition, other texture features also play crucial roles in the model. This suggests a substantial association between the texture of the necrotic lesion and femoral head collapse.

Collectively, the model demonstrates effective predictive performance by analyzing the volume, shape, and internal heterogeneity of the lesions using radiomics. It performs well in classifying cases that exhibit either small and homogeneous lesions or large and highly heterogeneous ones. However, the model’s ability to discern intermediate cases requires further enhancement to improve its predictive accuracy (Fig. 7).

Fig. 7.

Representative images from three patients in the test set. A-C: A patient correctly predicted by the LightGBM model, who did not experience collapse; D-F: Another patient correctly predicted by the model, who experienced collapse; G-I: A patient incorrectly predicted by the model, who ultimately experienced collapse

Previous studies using manual assessment methods have established that the extent and internal signal alterations of necrotic lesions correlate with femoral head collapse [33]. The heterogeneity of necrotic lesions may indicate repair processes and structural changes within the lesion, as well as potential risks of femoral head collapse. The automated radiomics analysis pipeline proposed in the current study could objectively and quantitatively capture the heterogeneity of necrotic lesions, enabling a more comprehensive assessment compared to manual methods. The correlation between heterogeneity and femoral head collapse was established in a data-driven manner through machine learning. The results of our study showed that the prediction performance of the LightGBM model was superior to that of the LR model, indicating that the current task could benefit from employing a more advanced machine learning algorithm.

There are several limitations in our study. Firstly, due to the retrospective design of this study, the risk of selection bias may be unavoidable, and further prospective studies are needed to validate the findings. Secondly, to enhance the generalizability of our models, we acquired MRI images from three different scanners. Although we employed extensive preprocessing techniques to normalize the images, there remains a possibility of introducing data heterogeneity. Lastly, while our study utilized multicentric data, there was an uneven gender distribution in the test set, which may have an impact on the model performance. A larger and more diverse sample from additional centers is still needed to validate the robustness of the results.

Conclusion

In conclusion, based on automatic lesion segmentation and radiomics, we have developed and validated a model for prediction of femoral head collapse in patients with ONFH. Our study further elucidates the correlation between the imaging heterogeneity of ONFH lesions and the associated risk of femoral head collapse. Further research is essential to ascertain the clinical applicability of this radiomics-based approach and explore its potential to enhance decision-making in the treatment of ONFH.

Acknowledgements

The authors thank all colleagues who participated in this study.

Abbreviations

ACC: Accuracy
ADASYN: Adaptive Synthetic Sampling
ARCO: Association Research Circulation Osseous
AUC: Area Under the Curve
CLEAR: CheckList for EvaluAtion of Radiomics Research
DSC: Dice Similarity Coefficient
GLDM: Gray Level Dependence Matrix
GLRLM: Gray Level Run Length Matrix
GLSZM: Gray Level Size Zone Matrix
HD: Hausdorff Distance
ICC: Intra-class Correlation Coefficient
JIC: Japanese Investigation Committee
LASSO: Least Absolute Shrinkage and Selection Operator
LightGBM: Light Gradient Boosting Machine
LR: Logistic Regression
METRICS: METhodological RadiomICs Score
MRI: Magnetic Resonance Imaging
NGTDM: Neighboring Gray Tone Difference Matrix
NPV: Negative Predictive Value
ONFH: Osteonecrosis of the Femoral Head
PPV: Positive Predictive Value
ROC: Receiver Operating Characteristic
ROI: Region of Interest
SEN: Sensitivity
SPE: Specificity
TSE: Turbo Spin Echo

Author contributions

Conception and design: SG and WH; Administrative support: WH and YW; Provision of study materials or patients: ZL and WH; Collection and assembly of data: HZ, MW and JP; Data analysis and interpretation: SG, ZL and JP; Manuscript writing: All authors. All authors read and approved the final manuscript.

Funding

This study has received funding by grants from the Social Public Welfare and Basic Research Project of Zhongshan City (Grant Number 2023B3029).

Data availability

The datasets generated and/or analysed during the current study are not publicly available as they have not been deposited into a publicly accessible repository, but are available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate

This study was approved by the ethics committee of the First Affiliated Hospital of Guangzhou University of Chinese Medicine and the Third Affiliated Hospital of Guangzhou University of Chinese Medicine (Reference No: JY-2023-005). Consent to participate was waived due to the retrospective nature of the study design.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Ziqi Li, Email: lzq391@126.com.

Jiewei Peng, Email: zszyypjw@126.com.

Articles from BMC Medical Informatics and Decision Making are provided here courtesy of BMC
