Journal of Digital Imaging. 2023 Jun 2;36(5):2025–2034. doi: 10.1007/s10278-023-00858-1

Automatic Image Segmentation and Grading Diagnosis of Sacroiliitis Associated with AS Using a Deep Convolutional Neural Network on CT Images

Ke Zhang 1,#, Guibo Luo 2,3,#, Wenjuan Li 1,#, Yunfei Zhu 1,#, Jielin Pan 4, Ximeng Li 1, Chaoran Liu 1, Jianchao Liang 4, Yingying Zhan 1, Jing Zheng 5, Shaolin Li 1, Wenli Cai 2, Guobin Hong 1
PMCID: PMC10501961  PMID: 37268841

Abstract

Ankylosing spondylitis (AS) is a chronic inflammatory disease that causes inflammatory low back pain and may even limit activity. The grading diagnosis of sacroiliitis on imaging plays a central role in diagnosing AS. However, the grading diagnosis of sacroiliitis on computed tomography (CT) images is viewer-dependent and may vary between radiologists and medical institutions. In this study, we aimed to develop a fully automatic method to segment the sacroiliac joint (SIJ) and then grade and diagnose sacroiliitis associated with AS on CT. We studied 435 CT examinations from patients with AS and controls at two hospitals. No-new-UNet (nnU-Net) was used to segment the SIJ, and a 3D convolutional neural network (CNN) was used to grade sacroiliitis with a three-class method, using the grading results of three veteran musculoskeletal radiologists as the ground truth. We defined grades 0–I as class 0, grade II as class 1, and grades III–IV as class 2 according to the modified New York criteria. nnU-Net segmentation of the SIJ achieved Dice, Jaccard, and relative volume difference (RVD) coefficients of 0.915, 0.851, and 0.040, respectively, with the validation set and 0.889, 0.812, and 0.098, respectively, with the test set. The areas under the curves (AUCs) of classes 0, 1, and 2 using the 3D CNN were 0.91, 0.80, and 0.96, respectively, with the validation set and 0.94, 0.82, and 0.93, respectively, with the test set. The 3D CNN was superior to the junior and senior radiologists in the grading of class 1 for the validation set and inferior to the expert for the test set (P < 0.05). The fully automatic method constructed in this study based on a convolutional neural network could be used for SIJ segmentation and then for accurate grading and diagnosis of sacroiliitis associated with AS on CT images, especially for class 0 and class 2. The method was less effective for class 1 but still more accurate than the senior radiologist.

Keywords: Ankylosing spondylitis, Sacroiliitis, Deep convolutional neural network, Computed tomography, Automatic segmentation

Introduction

Ankylosing spondylitis (AS) is a chronic inflammatory disease mainly involving the sacroiliac joint (SIJ) and spine. The grading diagnosis of sacroiliitis on imaging plays a central role in diagnosing AS according to the modified New York criteria [1, 2]. Conventional radiography (X-ray) is the current first-line imaging method for grading sacroiliitis. However, its sensitivity, specificity, and reliability are poor because the pelvic anatomy is complex and the superposition of bowel gas can hide structural changes [3]. Previous studies have shown that magnetic resonance imaging (MRI) is superior to X-ray in detecting structural lesions in the SIJ, such as erosion, sclerosis, and ankylosis related to sacroiliitis [4–7]. Nevertheless, even more recent MR imaging methods, such as three-dimensional gradient echo (3D-GRE), synthetic computed tomography (CT)/MRI, and zero echo time (ZTE), cannot depict the cortical bony surface directly, which can result in misinterpretation of erosions, and sclerosis can mask other critical structural changes related to sacroiliitis [8–12]. CT has the highest specificity for structural lesions in sacroiliitis compared with MRI and X-ray because of thinner slices and higher spatial resolution. Furthermore, CT is widely available with no absolute contraindications, and the acquisition is fast such that motion artifacts related to patient discomfort are not a problem. Nevertheless, the use of CT in AS has been hindered by its high radiation exposure. Recent technical advances in scanner technology and acquisition protocols have reduced the effective radiation dose close to the level of X-ray [13]. Therefore, CT is a necessary imaging method to improve the diagnostic accuracy of sacroiliitis, especially when MR images are ambiguous [14, 15].

The diagnosis of sacroiliitis on CT is viewer-dependent and may be inconsistent between radiologists and medical institutions. According to the modified New York criteria, sacroiliitis consists of five grades (0–IV), and bilateral grade II–IV or unilateral grade III–IV could be diagnosed as definite sacroiliitis related to AS [2]. In fact, some distinctions between grades I and II or II and III on CT images are difficult and subjective in daily work, especially for radiologists or clinicians not specialized in AS. In addition, many of the signs individually are not specific to sacroiliitis and may also occur in degenerative diseases and other non-AS-related diseases, which need to be identified [16]. Therefore, supportive tools that can help to grade sacroiliitis on CT images in patients with suspected AS are needed. In this context, deep learning methods have the potential to provide supportive tools to radiologists and clinicians [17].

Axial spondyloarthritis (axSpA) can be divided into nonradiographic axSpA (nr-axSpA) and radiographic axSpA (r-axSpA), which is AS according to the Assessment of SpondyloArthritis international Society (ASAS) classification criteria [18]. Subchondral bone marrow edema (BME) in the SIJs on MRI is a required finding of active sacroiliitis of nr-axSpA. The presence of definite radiographic sacroiliitis on X-ray/CT images is needed to classify r-axSpA based on the modified New York criteria [2]. Therefore, different imaging modalities (X-ray, CT, and MR images) are all helpful in diagnosing axSpA. In the existing relevant research, artificial intelligence (AI) has mainly been applied to axSpA diagnosis based on these imaging modalities. Grauhan et al. indicated that their convolutional neural network (CNN) was highly accurate in detecting SIJs on radiographs [19]. Proft et al. proposed an artificial neural network that could accurately detect (area under the receiver operating characteristic curve (AUC): 0.88) definite radiographic sacroiliitis relevant for the diagnosis of axSpA on radiographs, close to expert performance [20]. However, X-ray is of limited value for the early diagnosis of axSpA because it cannot show active lesions and is poor at depicting minor erosion. Faleiros et al. described inflammatory patterns in the SIJs evident on MRI using gray-level, texture, and spectral features, with k-nearest neighbors performing best for pattern recognition (AUC: 0.96) [21]. MRI could potentially be used to diagnose and classify axSpA after the ASAS MRI working group defined more active and structural lesions in addition to BME in 2019 [22]. Bressem et al. showed that deep neural networks could detect inflammatory changes (AUC: 0.94) and structural changes (AUC: 0.89) to the SIJ indicative of axSpA on MRI [23]. Tenorio et al. performed an MRI-based radiomics analysis to associate texture-based biomarkers with sacroiliitis, SpA, and its axial or peripheral subtypes [24]. In the study by Ye et al., the clinical radiomics nomogram model showed better efficacy (AUC: 0.90) for differentiating axSpA than the radiomics model (AUC: 0.82) [25]. Regardless, conventional MRI may lead to misinterpretation of erosion because it cannot depict the cortical bony surface directly. Also, the definition of MRI lesions in the SIJ considered highly suggestive of axSpA, such as BME, erosion, and fat lesion, is still being updated [26]. Therefore, the value of previous studies based on the old standards in clinical application remains to be discussed. Although CT cannot be used to detect active lesions, it is still the reference standard for identifying structural lesions, especially erosion. Castro-Zunti et al. found that a random forest classifier achieved an AUC of 0.97 for erosion vs. young control patients and 0.91 for erosion vs. old control patients. They also found that the deep learning classifier trained without minimizing validation loss achieved an AUC of 0.97 for erosion vs. all control patients [27]. In that study, regions of interest (ROIs) were manually segmented, and the grading or diagnosis of AS was not performed. Shenkman et al. developed a new automatic algorithm for diagnosing and grading sacroiliitis on CT scans as incidental findings. The algorithm first computed and refined the ROIs of SIJs using a UNet classifier and a four-tree random forest. Then, each SIJ in each CT slice was graded with the CNN classifier, and finally, sacroiliitis diagnosis and grading were performed by combining the individual slice grades using a random forest. The experimental results yielded binary (healthy and unhealthy) and 3-class (healthy, suspicious, and sick) case classification AUCs of 0.97 and 0.57, respectively [28]. However, only unilateral sacroiliitis was diagnosed and graded in that study; the diagnosis of radiographic sacroiliitis relevant to AS by combining the unilateral SIJ grades was not implemented.

In this study, we aimed to develop a fully automatic method based on a deep convolutional neural network to segment SIJ images and then grade and diagnose sacroiliitis associated with AS on CT images.

Materials and Methods

Study Population

This retrospective study was approved by the ethics committees of the hospitals, and informed consent was waived. The data were deidentified to protect the patients’ privacy before use in this study.

This retrospective study used CT data from the Fifth Affiliated Hospital of Sun Yat-sen University and Zhuhai People’s Hospital. Patients suspected of AS were recruited between September 2016 and November 2021 at the Fifth Affiliated Hospital of Sun Yat-sen University and between November 2019 and October 2021 at Zhuhai People’s Hospital. The exclusion criteria were (a) condensing osteitis, infectious sacroiliitis, and primary or secondary tumor on SIJ; (b) poor image quality; and (c) incomplete data labeling. The CT scans from the Fifth Affiliated Hospital of Sun Yat-sen University were randomly divided into a training set (80%, 546 SIJs) and a validation set (20%, 142 SIJs). CT scans from Zhuhai People’s Hospital were all used as the external test set (182 SIJs). Figure 1 shows the flowchart of the data selection process and the creation of the training set, validation set, and external test set.

Fig. 1.

Flow diagram shows the process of training, validation, and external test sets data selection. AS = ankylosing spondylitis.

CT Image Acquisition

CT examinations were obtained by a 16-MDCT scanner (Somatom Sensation, Siemens Healthineers) and dual-source CT scanner (Somatom Definition Flash, Siemens Healthineers). The 16-MDCT scanning parameters were as follows: tube voltage, 140 kVp; tube current, 42–113 mAs; scan slice thickness, 5 mm; and reconstruction slice thickness, 0.625 mm. The range of the mean volume CT dose index of this protocol was 4.76–12.85 mGy. The dual-source CT scanning parameters were as follows: tube voltage, 120 kVp; tube current, 63–165 mAs; scan slice thickness, 5 mm; and reconstruction slice thickness, 0.75 mm. The range of the mean volume CT dose index of this protocol was 4.3–11.14 mGy.

Workflow of Method

The proposed workflow in this study contained three steps: (1) segmentation of the SIJ using no-new-UNet (nnU-Net), (2) grading of each SIJ using a 3D CNN, and (3) sacroiliitis diagnosis by combining the grading results. Figure 2 shows an overview of our proposed framework. The segmentation and grading steps were developed with PyTorch (v1.4.0).

Fig. 2.

Overview of our proposed framework. Based on the modified New York criteria, we defined grades 0–I as class 0, grade II as class 1, and grades III–IV as class 2 in this study. Grades 0–IV (0, normal, no disease; I, suspicious, some blurring of the SIJ margins; II, mild sclerosis, some erosions; III, partial ankylosis, severe erosions, reduced sacroiliac joint space; IV, complete ankylosis, no sacroiliac joint space) were assessed according to the modified New York criteria, and bilateral grade II–IV (class 1–2) or unilateral grade III–IV (class 2) was diagnosed as sacroiliitis associated with AS [2].

Segmentation of SIJ

ROIs of the left and right SIJs were first delineated separately by a junior radiologist (with 5 years of experience in imaging diagnosis) using the freely distributed quantitative imaging software 3DQI (https://3dqi.mgh.harvard.edu).

In this study, SIJ segmentation was conducted with the nnU-Net segmentation framework, which is a general biomedical image segmentation method with advanced performance [29]. In the training stage, we selected a volumetric patch size of 288 × 288 × 24 and trained with a mini-batch optimizer for 300 epochs. The initial learning rate was set to 0.01. Instance normalization was used for each layer of the model, and the loss function was defined as the sum of the cross-entropy and Dice losses. Various data augmentation methods were applied to increase the size and diversity of the training set, including random image adjustment (noise, blurring, brightness, and contrast) and random image deformation (rotation, shift, shear, and scaling).
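The combined loss described above can be illustrated with a small plain-Python sketch for a binary (SIJ vs. background) case. This is only an illustration under stated assumptions: the function names are hypothetical, the Dice term is written as 1 − soft Dice, and the inputs are flat lists of foreground probabilities and 0/1 labels. It is not the nnU-Net implementation, which operates on multi-class tensors.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-6):
    """1 - soft Dice for flat lists of foreground probabilities and 0/1 labels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, targets):
    """Sum of the cross-entropy and Dice terms (assumed equal weighting)."""
    return binary_cross_entropy(probs, targets) + soft_dice_loss(probs, targets)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    print(combined_loss([0.9, 0.1, 0.8, 0.2], labels))  # confident and mostly right
    print(combined_loss([0.4, 0.6, 0.5, 0.5], labels))  # uncertain, higher loss
```

The cross-entropy term penalizes each voxel independently, while the Dice term responds to overlap of the whole region, which is one common motivation for summing the two when the foreground is small.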

In the inference stage, images were predicted with a sliding window strategy, in which the window size was the same as the patch size used during training. All other parameters were the same as those described for training. After segmentation, the SIJ regions were divided into left and right regions by the connected domain method. A senior radiologist reviewed all the segmented images.
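To make the sliding window strategy concrete, the following sketch computes window start positions along each axis for a window equal to the 288 × 288 × 24 training patch. The 50% overlap between neighboring windows and the example volume size of 512 × 512 × 60 voxels are assumptions for illustration; the paper does not report either value, and the helper name is hypothetical.

```python
import math


def window_starts(image_size, patch_size, step_fraction=0.5):
    """Start indices of sliding windows along one axis, covering the full extent.

    Windows are spaced evenly so that the last one ends exactly at the image
    border; step_fraction is the target step as a fraction of the patch size.
    """
    if image_size <= patch_size:
        return [0]  # a single window; the image would need padding to fit
    target_step = patch_size * step_fraction
    n_steps = math.ceil((image_size - patch_size) / target_step) + 1
    max_start = image_size - patch_size
    actual_step = max_start / (n_steps - 1)
    return [round(actual_step * i) for i in range(n_steps)]


if __name__ == "__main__":
    patch = (288, 288, 24)    # patch size reported in the paper
    volume = (512, 512, 60)   # hypothetical CT volume size
    for axis, (size, p) in enumerate(zip(volume, patch)):
        print(axis, window_starts(size, p))
```

Predictions from overlapping windows are then typically averaged voxel-wise before the final label is assigned.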

Grading and Diagnosis of Sacroiliitis

The regions outside the SIJs were set to 0, and both the left and right SIJs were cropped with the cubic bounding box of the segmented SIJ regions. The whole SIJs were entirely preserved in these ROIs, and most areas outside the SIJs were eliminated. After the ROIs of the left SIJs were mirrored horizontally, we unified the resolution of all images and resized the cropped regions to a uniform size (120 × 80 × 24).
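The masking, cropping, and mirroring steps described above can be sketched on nested lists indexed as volume[z][y][x]. This is a toy illustration, not the authors' preprocessing code: the function names are hypothetical, the x axis is assumed to be the left-right direction, and the final resampling to 120 × 80 × 24 is omitted.

```python
def bounding_box(mask):
    """Inclusive (min, max) index pairs per axis (z, y, x) of nonzero mask voxels."""
    coords = [(z, y, x)
              for z, plane in enumerate(mask)
              for y, row in enumerate(plane)
              for x, value in enumerate(row) if value]
    if not coords:
        raise ValueError("mask is empty")
    return [(min(c[a] for c in coords), max(c[a] for c in coords)) for a in range(3)]


def mask_and_crop(image, mask, mirror=False):
    """Zero voxels outside the mask, crop to its bounding box, optionally mirror in x."""
    (z0, z1), (y0, y1), (x0, x1) = bounding_box(mask)
    cropped = [[[image[z][y][x] if mask[z][y][x] else 0
                 for x in range(x0, x1 + 1)]
                for y in range(y0, y1 + 1)]
               for z in range(z0, z1 + 1)]
    if mirror:
        # Flip each row so that left and right joints share one orientation.
        cropped = [[row[::-1] for row in plane] for plane in cropped]
    return cropped


if __name__ == "__main__":
    image = [[[1, 2, 3, 4], [5, 6, 7, 8]]]   # one slice, two rows, four columns
    mask = [[[0, 1, 1, 0], [0, 0, 1, 0]]]
    print(mask_and_crop(image, mask))
    print(mask_and_crop(image, mask, mirror=True))
```

Mirroring one side lets a single classifier see both joints in the same orientation, which effectively doubles the usable training examples per anatomical pattern.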

In the grading step, the regions of SIJs from automatic segmentation were sent to the 3D CNN for analysis. The proposed deep learning model uses a six-layer structure consisting of four CNN blocks followed by two fully connected layers, where each CNN block is a stack of 3D convolution, 3D max pooling, batch normalization, and ReLU activation layers. Global average pooling is used at the end of the last convolutional layer to generate the 256-dimensional deep learning features. To reduce overfitting, a dropout layer is inserted between the two fully connected layers, and we adopted various data augmentation methods based on random image deformation, such as random cropping, rotation, shifting, and scaling.

Each SIJ was evaluated by three veteran musculoskeletal radiologists (with 20 years of experience in musculoskeletal imaging). Radiologists were blinded to clinical information and evaluated deidentified versions of the CT scans. We defined grades 0–I as class 0, grade II as class 1, and grades III–IV as class 2. Grades 0–IV (0, normal, no disease; I, suspicious, some blurring of the SIJ margins; II, mild sclerosis, some erosions; III, partial ankylosis, severe erosions, reduced sacroiliac joint space; and IV, complete ankylosis, no sacroiliac joint space) were assessed according to modified New York criteria [2]. The ground truth was defined as agreement among all three radiologists. If there were different opinions, radiologists discussed them and reached an agreement. According to the diagnostic criteria of sacroiliitis, bilateral grade II–IV (class 1–2) or unilateral grade III–IV (class 2) was diagnosed as definite sacroiliitis related to AS. The rest were the control group.
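The class mapping and the rule for combining the two unilateral results can be written out as a short sketch. The function names are hypothetical; the logic follows the definitions in the text (grades 0–I as class 0, grade II as class 1, grades III–IV as class 2; bilateral class 1–2 or unilateral class 2 counts as definite sacroiliitis).

```python
def grade_to_class(grade):
    """Map a modified New York grade (0-4, i.e., 0-IV) to the three study classes."""
    if grade not in range(5):
        raise ValueError("grade must be an integer from 0 to 4")
    if grade <= 1:
        return 0
    if grade == 2:
        return 1
    return 2


def has_definite_sacroiliitis(left_class, right_class):
    """Bilateral class >= 1 or unilateral class 2 is diagnosed as definite sacroiliitis."""
    bilateral = left_class >= 1 and right_class >= 1
    unilateral = left_class == 2 or right_class == 2
    return bilateral or unilateral


if __name__ == "__main__":
    for left, right in [(0, 0), (0, 2), (2, 2), (3, 1), (4, 0)]:
        classes = grade_to_class(left), grade_to_class(right)
        print(left, right, classes, has_definite_sacroiliitis(*classes))
```

Because the rule is deterministic, no additional classifier is needed to go from the two per-joint classes to the patient-level diagnosis.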

Evaluation by Radiologists of Different Seniority

To assess the performance of radiologists of different seniority, junior, senior, and expert radiologists (with 5, 10, and 20 years of experience in imaging diagnosis, respectively) graded sacroiliitis following the three-class grading standard mentioned above and diagnosed sacroiliitis related to AS based on the grading results.

Performance Evaluation and Statistical Analysis

The performance of nnU-Net and the 3D CNN was evaluated with the validation and test sets. The automatic segmentation performance of nnU-Net was assessed using the Dice, Jaccard, and relative volume difference (RVD) coefficients. The Dice coefficient, a measure of the spatial overlap between the segmentation and the ground truth, is defined as 2TP/(2TP + FP + FN). The Jaccard and RVD coefficients are defined as TP/(TP + FP + FN) and Vol(res)/Vol(ref) − 1, respectively. TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively; res denotes the automated segmentation results exported by nnU-Net, and ref denotes the reference segmentation results contoured by radiologists. The sacroiliitis grading performance of the 3D CNN was evaluated by the AUC, accuracy, precision, recall, and F1 score. The diagnostic performance for sacroiliitis was evaluated by the accuracy, sensitivity, and specificity. Accuracy refers to the total correct predictions over all predictions and is defined as (TP + TN)/(TP + TN + FP + FN). Precision refers to how much of what was predicted as “true” is actually “true” and is defined as TP/(TP + FP). Recall, also called sensitivity, refers to how much of what is actually “true” was predicted as “true” and is defined as TP/(TP + FN). The F1 score is the harmonic mean of precision and recall, gives an estimate of a classifier that balances the two qualities, and is defined as 2 × precision × recall/(precision + recall). Specificity refers to how much of what is actually normal was predicted as normal and is defined as TN/(TN + FP). McNemar’s test was used to compare the grading diagnosis of sacroiliitis between the model and radiologists of varying seniority, as well as the model’s sensitivity (recall), specificity, and precision in different grades. P < 0.05 was considered indicative of a statistically significant difference.
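The definitions above translate directly into code. The sketch below is a minimal plain-Python illustration with hypothetical function names; binary masks are given as flat 0/1 lists, and RVD is computed exactly as defined in the text, Vol(res)/Vol(ref) − 1, so it can be negative when the automatic segmentation is smaller than the reference.

```python
def segmentation_metrics(pred, ref):
    """Dice, Jaccard, and RVD for two flat binary masks of equal length."""
    tp = sum(1 for p, r in zip(pred, ref) if p and r)
    fp = sum(1 for p, r in zip(pred, ref) if p and not r)
    fn = sum(1 for p, r in zip(pred, ref) if not p and r)
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / (tp + fp + fn)
    rvd = (tp + fp) / (tp + fn) - 1  # Vol(res) / Vol(ref) - 1
    return dice, jaccard, rvd


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, specificity, and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


if __name__ == "__main__":
    print(segmentation_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
    print(classification_metrics(tp=8, tn=5, fp=2, fn=1))
```

For the three-class grading task, the per-class precision, recall, and F1 values are obtained by treating each class in turn as the positive class (one-vs-rest) and counting TP, FP, and FN accordingly.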

Continuous variables are presented as the mean ± standard deviation, and categorical variables are represented as frequencies. Categorical variables were compared by using the chi-square test. Continuous variables were compared by one-way analysis of variance (ANOVA). These data were analyzed by SPSS (version 26.0; IBM, Armonk, NY). P < 0.05 was considered indicative of a statistically significant difference.

Results

Patient Characteristics

A total of 435 patients (344 patients from the Fifth Affiliated Hospital of Sun Yat-sen University and 91 from Zhuhai People’s Hospital) were eligible for the final analysis. The training, validation, and test sets included 546, 142, and 182 SIJ images, respectively. The patients were aged from 18 to 68 years. Table 1 lists the clinical characteristics and specific grading for all sets. According to the reference standard, classes 0–2 represented 296 SIJs (54%), 82 SIJs (15%), and 168 SIJs (31%) of the training set; 70 SIJs (49%), 24 SIJs (17%), and 48 SIJs (34%) of the validation set; and 42 SIJs (23%), 33 SIJs (18%), and 107 SIJs (59%) of the test set, respectively. Patients with sacroiliitis represented 118 (43%), 36 (51%), and 69 (76%) patients of the training, validation, and test sets, respectively.

Table 1.

Characteristics of patients

Parameters       Training set                 Validation set               Test set
                 Control      Sacroiliitis    Control      Sacroiliitis    Control      Sacroiliitis
n (patients)     155          118             35           36              22           69
Age (years)      36.9 ± 10.1  38.6 ± 11.0     36.4 ± 9.9   37.9 ± 11.3     35.6 ± 9.4   38.2 ± 10.9
Sex
    Male         85           93              17           29              8            52
    Female       70           25              18           7               14           17
Grading (SIJs)
    Class 0      294          2               69           1               37           5
    Class 1      16           66              1            23              7            26
    Class 2      0            168             0            48              0            107

Based on modified New York criteria, we defined grades 0–I as class 0, grade II as class 1, and grades III–IV as class 2

For all three sets, there were no statistically significant differences in age and sex characteristics (P > 0.05). There was no statistically significant difference in the grading of sacroiliitis between the training and validation sets (P > 0.05). In contrast, the grading distribution of the test set differed significantly from that of the training and validation sets (P < 0.05).

Performance of SIJ Segmentation

nnU-Net achieved a Dice coefficient of 0.915, Jaccard coefficient of 0.851, and RVD coefficient of 0.040 in SIJ segmentation for the validation set. For the test set, a Dice coefficient of 0.889, Jaccard coefficient of 0.812, and RVD coefficient of 0.098 were achieved.

Grading and Diagnostic Performance of Sacroiliitis

Compared with the ground truth (consensus decision of all three veteran musculoskeletal radiologists), the 3D CNN for the grading of sacroiliitis showed a microaverage AUC of 0.92 and an accuracy of 0.894 for the validation set. For the test set, a microaverage AUC of 0.91 and an accuracy of 0.802 were achieved. The AUCs of classes 0–2 were 0.91, 0.80, and 0.96, respectively, for the validation set and 0.94, 0.82, and 0.93, respectively, for the test set. The lowest AUCs were obtained for class 1 with the validation and test sets. Figure 3 provides confusion matrices of the grading of sacroiliitis for the validation and test sets. Figure 4 shows the receiver operating characteristic (ROC) curves and AUCs of the model performance with the test and validation sets. The recall for class 1 was the lowest for all sets compared with classes 0 and 2; the precision for class 2 was higher than that for class 1 for the test set with the 3D CNN (P < 0.05). In addition, the model was superior to the junior and senior radiologists in the grading of class 1 for the validation set and inferior to the expert for the test set (P < 0.05). Details on the performance of the model are given in Table 2. The diagnostic performance for sacroiliitis was obtained by combining the unilateral grading results. The 3D CNN had a sensitivity and specificity of 0.917 and 0.943, respectively, for the validation set and a sensitivity of 0.913 and specificity of 0.864 for the test set. However, there were no statistically significant differences between the model and the radiologists of different seniority. More details are shown in Table 3.
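A microaverage AUC pools the one-vs-rest decisions of all three classes into a single ROC analysis instead of averaging the three per-class AUCs. The sketch below illustrates the idea in plain Python on made-up probabilities; the function names and toy inputs are assumptions for illustration and are not the evaluation code or data of the study.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auc(probabilities, true_classes, n_classes=3):
    """Flatten every (sample, class) pair into one binary problem, then compute AUC."""
    scores, labels = [], []
    for probs, true in zip(probabilities, true_classes):
        for c in range(n_classes):
            scores.append(probs[c])
            labels.append(1 if c == true else 0)
    return auc(scores, labels)


if __name__ == "__main__":
    toy_probs = [[0.8, 0.1, 0.1], [0.3, 0.4, 0.3], [0.1, 0.2, 0.7]]  # made-up outputs
    toy_truth = [0, 1, 2]
    print(micro_average_auc(toy_probs, toy_truth))
```

Because every sample contributes one positive and two negative pairs, the microaverage is dominated by the more frequent classes, which is worth keeping in mind when class 1 is the smallest group.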

Fig. 3.

Confusion matrices for the performance of the 3D CNN network in grading of sacroiliitis on the validation (left) and test (right) sets.

Fig. 4.

Receiver operating characteristic (ROC) curves and associated areas under the receiver operating characteristic curves (AUCs) for 3D CNN network in grading of sacroiliitis on the validation (left) and test (right) sets.

Table 2.

The performance of model and radiologists in grading of sacroiliitis.

Parameters     Validation set                   Test set
               Class 0    Class 1    Class 2    Class 0    Class 1    Class 2
Precision
    3D CNN     0.919      0.923      0.855      0.704      0.588      0.883ᵇ
    Junior     0.882      0.476ᵃ     0.774      0.694      0.385      0.860
    Senior     0.939      0.538ᵃ     0.840      0.706      0.435      0.889
    Expert     0.944      0.941      0.887      0.848      0.759      0.944
Recall
    3D CNN     0.971ᵇ     0.500      0.979ᵇ     0.905ᵇ     0.303      0.916ᵇ
    Junior     0.857      0.417      0.854      0.810      0.303      0.860
    Senior     0.886      0.583      0.875      0.857      0.303      0.897
    Expert     0.971      0.667      0.979      0.929      0.667ᵃ     0.944
F1 score
    3D CNN     0.944      0.649      0.913      0.792      0.400      0.899
    Junior     0.869      0.445      0.812      0.748      0.339      0.860
    Senior     0.912      0.560      0.857      0.774      0.357      0.893
    Expert     0.957      0.781      0.931      0.887      0.710ᵃ     0.944
Accuracy (all classes)
    3D CNN     0.894                            0.802
    Junior     0.782                            0.747
    Senior     0.831                            0.780
    Expert     0.937                            0.890

ᵃ Radiologists with statistical differences from the model in grading

ᵇ Other classes with statistical differences from class 1

Table 3.

The performance of model and radiologists in diagnosis of sacroiliitis associated with AS

Parameters     Validation set                         Test set
               3D CNN   Junior   Senior   Expert     3D CNN   Junior   Senior   Expert
Sensitivity    0.917    0.889    0.944    0.944      0.913    0.899    0.884    0.942
Specificity    0.943    0.829    0.886    0.943      0.864    0.773    0.864    0.909
Accuracy       0.930    0.859    0.915    0.944      0.901    0.868    0.879    0.934
PPV            0.943    0.842    0.895    0.944      0.955    0.925    0.953    0.970
NPV            0.917    0.879    0.939    0.943      0.760    0.708    0.704    0.833
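As a consistency check, the 3D CNN columns of Table 3 can be reproduced from the patient counts in Table 1. The sketch below recovers confusion counts by rounding sensitivity × (sacroiliitis patients) and specificity × (control patients); these counts are inferred here for illustration and are not reported in the paper, and the function name is hypothetical.

```python
def diagnostic_summary(n_pos, n_neg, sensitivity, specificity):
    """Infer confusion counts from rounded rates, then derive accuracy, PPV, and NPV."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    fn = n_pos - tp
    fp = n_neg - tn
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": round((tp + tn) / (n_pos + n_neg), 3),
        "ppv": round(tp / (tp + fp), 3),
        "npv": round(tn / (tn + fn), 3),
    }


if __name__ == "__main__":
    # Validation set: 36 sacroiliitis and 35 control patients (Table 1).
    print(diagnostic_summary(36, 35, sensitivity=0.917, specificity=0.943))
    # Test set: 69 sacroiliitis and 22 control patients (Table 1).
    print(diagnostic_summary(69, 22, sensitivity=0.913, specificity=0.864))
```

Under this reconstruction, the validation set gives 33 true positives and 33 true negatives out of 71 patients, and the test set gives 63 true positives and 19 true negatives out of 91 patients, which matches the reported accuracy, PPV, and NPV to three decimals.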

Discussion

The grading and diagnosis of sacroiliitis associated with AS on CT images depend on the experience of radiologists and clinicians. A deep learning method based on a convolutional neural network we developed could achieve SIJ segmentation and accurately grade and diagnose sacroiliitis associated with AS on CT images.

Automatic SIJ segmentation on axial CT images is necessary for the subsequent grading task because manual delineation is tedious and subject to interobserver variability, which may affect results. In this study, we employed nnU-Net to segment the SIJ. As its name “No-new-UNet” indicates, nnU-Net is not a new network architecture. Instead, it achieves its strong performance by systematizing the complex process of manual method configuration: it automatically configures the preprocessing, network architecture, training, and post-processing for a given dataset and can be applied without requiring any user intervention. In this study, automatic SIJ segmentation was more accurate than that of Shenkman et al. This could be because the latter’s segmentation task was more complicated, as two-thirds of its CT scans included the entire body from the neck to the knees [28]. We found that automatic segmentation did not perform appropriately in some cases. A possible reason is that normal SIJs have individual differences and anatomical variations, such as iliosacral complexes, sacral defects, solitary hyperosteogeny, and sacroiliac joint deformities [30, 31].

Castro-Zunti et al. found that the deep learning classifier trained without minimizing validation loss performed best and achieved an AUC of 0.97 for erosion vs. control patients on CT [27]. Their work differs from ours in that it was a two-class study based on individual scan slices, whereas ours was a three-class study based on each SIJ. First and foremost, sclerosis and ankylosis, in addition to erosion, should be considered in the three-class grading of the SIJ. In our study, class 0 included not only control patients but also some patients with suspicious lesions. Also, erosion may occur in classes 1 and 2, and erosion on a given slice may coexist with other lesions. Therefore, it is difficult to meaningfully compare our work with this research. The solution proposed by Shenkman et al. involved a CNN classifier for grading individual CT scan slices followed by an ensemble of random forest classifiers for grading a patient from all their scan slices [28]. They achieved SIJ two-class classification with an accuracy of 87.1% based on slice three-class grading with an accuracy of 79%. In our study, the accuracy of SIJ three-class grading based on the 3D CNN was 0.802 for the test set, and the accuracy of the diagnosis of sacroiliitis associated with AS was 0.901 after combining the results of unilateral SIJ grading. Thus, the diagnosis of sacroiliitis associated with AS was ultimately achieved in our study. The 3D CNN model was used only in the grading task, and no additional classifier was required to combine the unilateral SIJ grading results. In this study, we did not use the traditional five-class standard according to the modified New York criteria because of insufficient data and the model’s objective limitations in multi-class classification tasks. The three-class standard in this article can not only lead to the definite diagnosis of sacroiliitis but also reflect the severity of each SIJ. The distinction between grades III and IV has little value because both represent an advanced stage of the disease. The distinction between grades 0 and I is of some value because follow-up can be recommended for patients with grade I, which is suspicious for sacroiliitis. With sufficient samples, grades 0 and I can be distinguished separately in the future. This study also showed that most misdiagnoses in the validation and test sets came from class 1. Almost all the evaluation indicators were the lowest in class 1. There may be several reasons for this: (1) The amount of class 1 data in the training set was small (only 15%). (2) Class 1 is relatively difficult to diagnose according to clinical experience in daily work, as also reflected in the grading results of the radiologists of different seniority. (3) The CT axial images of the SIJ contain more than 100 slices, and the most severe slices determine the final grade. That is, not every slice of the SIJ has the same grade. Class 2 may include slices of class 0 and class 1, so the impurity of grading may affect the accuracy. In further studies, the model could be trained using a slice-based grading method to improve the accuracy. The slice-based method refers to grading each slice of the SIJ first instead of grading the unilateral SIJ directly.

It should be noted that our current study has several limitations. First, the consensus of three veteran musculoskeletal radiologists was used as the reference standard because there is no gold standard for diagnosing sacroiliitis. However, the consensus of three experts can still be wrong, possibly affecting the model performance. Second, the model’s accuracy in diagnosing class 1 (grade II) was low, consistent with clinical experience. Therefore, to achieve clinical application, the model needs to be optimized by slice-based grading and by increasing the sample size in the future. Third, the amount of data was still small. Prospective multicenter studies with considerably larger datasets are our future work.

Conclusions

In conclusion, the fully automatic method based on a convolutional neural network that we developed could achieve SIJ segmentation and then make an accurate grading diagnosis of sacroiliitis associated with AS on CT images, especially for class 0 (equivalent to grades 0–I of the modified New York criteria, which means no definite sacroiliitis) and class 2 (equivalent to grades III–IV of the modified New York criteria, which means moderate and advanced sacroiliitis). The method was less effective for class 1 (equivalent to grade II of the modified New York criteria, which means mild sacroiliitis) but still more accurate than the senior radiologist. Future research should optimize the model and focus on improving the grading accuracy of class 1 by increasing the sample size and using the slice-based grading method.

Funding

The authors are grateful for the financial support from the National Natural Science Foundation of China (grant no. 82272104) and the Science and Technology Project in the Social Development Field of Zhuhai City, Guangdong Province, China (grant no. ZH22036201210066PWC).

Data Availability

The datasets generated and analysed during this study are available from the corresponding author on reasonable request.

Declarations

Ethical Approval

This research study was conducted retrospectively from data obtained for clinical purposes. This study was approved by the institutional ethics committee with waiver of informed consent (no. K14-1).

Consent to Participate

Not applicable.

Conflict of Interest

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ke Zhang, Guibo Luo, Wenjuan Li, and Yunfei Zhu contributed equally to this work.

Contributor Information

Ke Zhang, Email: zhangk265@mail2.sysu.edu.cn.

Guibo Luo, Email: luogb@pku.edu.cn.

Wenjuan Li, Email: liwj87@mail.sysu.edu.cn.

Yunfei Zhu, Email: yunfei9832@163.com.

Shaolin Li, Email: lishlin5@mail.sysu.edu.cn.

Wenli Cai, Email: wlcai@hotmail.com.

Guobin Hong, Email: honggb@mail.sysu.edu.cn.



Articles from Journal of Digital Imaging are provided here courtesy of Springer
