Journal of Orthopaedics. 2023 Mar 1;38:7–13. doi: 10.1016/j.jor.2023.03.001

5-Year progression prediction of endplate defects: Utilizing the EDPP-Flow convolutional neural network based on unbalanced data

Jason Pui Yin Cheung a,b, Xihe Kuang a,b, Teng Zhang a,b, Kun Wang c,d, Cao Yang c,d
PMCID: PMC9999205  PMID: 36910507

Abstract

Background

Lumbar disc degeneration (LDD) is considered one of the main causes of low back pain. Magnetic resonance imaging (MRI) is commonly used for the clinical diagnosis of LDD. Schmorl's nodes, high intensity zones (HIZs), Modic changes, and other MRI biomarkers of intervertebral disc (IVD) degeneration are also associated with low back pain. However, the progression and natural history of these features are unclear, and MRI currently offers limited predictive capacity.

Purpose

We aim to establish and validate a deep learning pipeline, EDPP-Flow, for the 5-year progression prediction of Schmorl's node, HIZ, and Modic changes, based on clinical MRIs.

Materials and methods

An MRI dataset developed from 1152 volunteers was used in this study. For each volunteer, two MRI scans, at baseline and at 5-year follow-up, were collected, and pathology labels were annotated as present or absent (with/without pathology) by two specialists with over 10 years of clinical experience. Our pipeline combined the published MRI-SegFlow with a state-of-the-art convolutional neural network for progression prediction of endplate defects. The label distribution of the dataset was unbalanced: present samples were far fewer than absent samples. Resampling and data augmentation strategies were adopted to increase the number of present samples during training and to balance the influence of different samples on the model, with the aim of improving prediction accuracy.

Results

Our pipeline achieved high weighted accuracy, sensitivity, and specificity for progression prediction of Schmorl's nodes (89.46 ± 3.71%, 89.19 ± 2.70%, 89.72 ± 2.42%), HIZs (91.75 ± 2.48%, 93.07 ± 3.96%, 90.43 ± 2.51%), and Modic changes (87.51 ± 2.23%, 87.93 ± 1.72%, 87.10 ± 1.99%) on the unbalanced dataset (present samples made up 4.3%, 11.7%, and 6.7% of the labels for the three pathologies, respectively).

Conclusion

We developed and validated a deep learning pipeline for the progression prediction of endplate defects, which showed high prediction accuracy on unbalanced data. The method has significant potential for clinical implementation.

Keywords: Lumbar disc degeneration, Disease progression prediction, Deep learning, Convolutional neural network, Unbalanced data

Abbreviations

ACC: Accuracy
BMI: Body mass index
CNN: Convolutional neural network
HIZ: High intensity zone
IVD: Intervertebral disc
LDD: Lumbar disc degeneration
MCs: Modic changes
MRI: Magnetic resonance imaging
NPV: Negative predictive value
PPV: Positive predictive value
SD: Standard deviation
VB: Vertebral body

1. Introduction

Magnetic resonance imaging (MRI) is commonly used in the clinical diagnosis of lumbar disc degeneration (LDD). Several MRI features of LDD have been identified, including Schmorl's nodes,1 high intensity zones (HIZs),2 and Modic changes (MCs).3 Schmorl's nodes are associated with low back pain (LBP) and endplate signal changes.4 HIZs are considered diagnostic biomarkers for LBP based on concordant pain responses via discography.2,5–7 HIZs consist of fluid-filled areas at a detached nucleus pulposus between the lamellae of a torn annulus fibrosus, leading to granulation tissue formation with secondary inflammation and neovascularization.8,9 Pro-inflammatory cytokines and mediators then sensitize spinal nerve nociceptors, leading to LBP. HIZs are also found to be strongly associated with LDD,10,11 a finding corroborated by Teraguchi et al.12 in a population-based cohort. MCs are signal changes in the bone marrow and endplate that are visible on MRI and are also reported to be risk factors for spinal degeneration.13,14 With the increased attention on personalized healthcare, it is important to determine the profile of spinal imaging phenotypes, which allows better identification of the sources of LBP and supports precision medicine to enhance patient outcomes. Although we have the means to correctly identify and classify various MRI phenotypes, we still lack adequate capacity to predict longitudinal changes in these LDD features. This is the limitation of cross-sectional MRI assessment, which offers no foresight of future progression.

The convolutional neural network (CNN) is a potential tool for bridging this knowledge gap. A CNN can learn the underlying image patterns associated with a specific pathology from a large amount of training data,15 and can automatically extract hierarchical features from raw images. CNNs are general-purpose models that have achieved remarkable performance in multiple MRI assessment tasks, including pathology identification,16–18 key-point detection,19,20 and anatomical segmentation.21–23 Currently, very little work has used CNNs to predict longitudinal changes of LDD pathology, owing to the lack of large-scale labelled datasets with follow-up for training and validating a CNN model.

In this study, we aim to establish and validate a deep learning pipeline, called EDPP-Flow, for the 5-year progression prediction of longitudinal changes in Schmorl's nodes, HIZs, and MCs, based on clinical MRIs. The objectives include 1) organizing a large-scale MRI dataset with pathology labels and follow-up; 2) establishing a CNN based progression prediction pipeline; 3) quantitatively evaluating the prediction performance of the pipeline.

2. Materials and Methods

2.1. Dataset

The dataset was developed from 1152 volunteers,24 who were over 18 years old and had no previous surgical treatment of the spine, marked spinal deformities, or spinal tumours. All volunteers gave written consent, and the study was ethically approved by the local institutional review board (UW 14–138). For each volunteer, demographic data (age, height, weight, and body mass index (BMI)) and two MRI scans, at baseline and follow-up, were collected. Demographic data were reported as mean ± standard deviation (SD). The interval between the baseline and follow-up scans was 5 years (within 6 months deviation). Each MRI scan contained at least the five IVDs from L1 to S1. In total, the dataset consisted of 2304 MRI scans, comprising 25344 images.

The MRI scans of the dataset were 1.5T HD sagittal lumbar T2-weighted MRIs, which were collected from three different institutions with different equipment and protocols. For MRI scans, the image resolution ranged from 448 × 448 to 512 × 512, slice number ranged from 11 to 17, slice thickness ranged from 4 mm to 5 mm, pixel spacing ranged from 0.66 mm to 0.68 mm, repetition time ranged from 3000 ms to 3680 ms, and echo time ranged from 81 ms to 111 ms. The diversity of the dataset may increase the robustness of our CNN model against the variation of image quality.

2.2. Pathology labels

Three different pathologies of LDD, namely Schmorl's nodes, HIZs, and MCs, were assessed for each IVD by two experienced (>10 years) spine specialists, who annotated all MRI scans (baseline and follow-up) of the dataset. Schmorl's nodes are characterized as a localized defect at the rostral endplate, the caudal endplate, or both. They appear as an indentation/herniation into the vertebral body with or without a sclerotic rim surrounding the indentation. HIZs are bright white signals located in the substance of the annulus fibrosus. They are distinct from the signal of the nucleus pulposus, are surrounded by the low-intensity (black) signal of the annulus fibrosus, and are usually of similar brightness to the cerebrospinal fluid at the same level. MCs are defined as high-signal intensity at the vertebral endplates. Since only T2-weighted MRIs were used, we only identified whether an MC was present, not its type.

For Schmorl's nodes, each IVD was annotated as 1/present (a Schmorl's node at the IVD on the upper vertebral endplate, the lower vertebral endplate, or both) or 0/absent (no Schmorl's nodes). Similarly, for HIZs and MCs, each IVD was annotated as 1 or 0, representing the presence or absence of the corresponding pathology. The pathology label of each IVD was taken as the assessment on which the two spine surgeons agreed; inconsistent results were discussed with a senior surgeon to reach a final agreement.

2.3. EDPP-Flow

The EDPP-Flow is illustrated in Fig. 1. First, MRI-SegFlow,25 an unsupervised MRI segmentation method, was adopted to generate soft segmentations of the vertebral body (VB) and IVD in the baseline lumbar MRIs. The soft segmentation indicated the probability that each pixel in the MRI belonged to the VB or the IVD. Based on the soft segmentation, the position and dimensions of each IVD were determined. Instead of the whole MRI scan, only the local region of each IVD was fed into the CNN model, to ensure the model focused on the image features related to the IVD. The IVD region was a cuboid centred on the corresponding IVD, which covered the whole IVD and surrounding tissues, such as the vertebral endplates, with a shape of 1.5w × 2w × n, where w was the width of the IVD and n was the slice number of the MRI scan. The IVD regions were extracted from the baseline MRI scan and the two soft segmentation results, resized to a standard size, and combined as the input of the CNN model. The CNN model of our pipeline adopted the basic architecture of VGG-M,26 modified to assess MRI scans, and consisted of an encoder and a classifier. The encoder extracted image features from the input IVD regions using sequential convolution and max-pooling layers. Based on these features, the classifier, which consisted of two fully connected layers, produced the probability that the pathology would be present in the input IVD region at follow-up. If the probability was higher than a threshold, the follow-up pathology was determined as present. If the baseline pathology was absent and the follow-up pathology was predicted as present, the sample was determined as progress; otherwise, the sample was non-progress. Resampling and data augmentation strategies were adopted in the training process to alleviate the unbalanced data and overfitting issues. The details of the network architecture, training protocol, and implementation are presented in the Appendix.
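The decision rule described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and dictionary keys are hypothetical, and the thresholds are the values reported in the Appendix (0.3, 0.6 and 0.5 for Schmorl's nodes, HIZs and MCs).

```python
# Thresholds on the CNN output, as reported in the Appendix.
# The dictionary keys and function names are hypothetical, chosen for illustration.
THRESHOLDS = {"schmorl": 0.3, "hiz": 0.6, "mc": 0.5}


def predict_followup(prob_present: float, pathology: str) -> int:
    """Return 1 (present) if the predicted probability is higher than the threshold."""
    return int(prob_present > THRESHOLDS[pathology])


def is_progress(baseline_label: int, prob_present: float, pathology: str) -> bool:
    """Progress = absent at baseline (0) and predicted present at follow-up (1)."""
    return baseline_label == 0 and predict_followup(prob_present, pathology) == 1
```

For example, a disc with no Schmorl's node at baseline and a predicted probability of 0.35 would be labelled progress, whereas the same probability for HIZ would not, because the HIZ threshold is higher.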

Fig. 1.

The overall framework of EDPP-Flow. Processing A represents the published MRI-SegFlow, which generates the soft segmentation of IVD and VB.

2.4. Evaluation

The dataset was divided into a training set and a testing set, with a sample number ratio of 0.85:0.15. The training set was used for the training of the CNN model and optimization of hyperparameters. The trained and optimized pipeline was tested on the testing set. The testing result was the final evaluation of the pipeline's prediction performance.

Based on the pathology label and pipeline prediction result, the samples of the testing set were divided into four categories, including true positive (TP), true negative (TN), false positive (FP) and false negative (FN). TP and TN were defined as the progress and non-progress samples predicted correctly by the pipeline. FP was the non-progress sample predicted as progress, while FN represented the progress sample predicted as non-progress. Further, five metrics were adopted for the quantitative evaluation of prediction performance of our pipeline, including weighted accuracy (wAcc), sensitivity (Sen), specificity (Spe), weighted positive predictive value (wPPV), and weighted negative predictive value (wNPV), which were defined as:

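$$\mathrm{wAcc} = \frac{w_p N_{TP} + w_n N_{TN}}{w_p (N_{TP} + N_{FN}) + w_n (N_{TN} + N_{FP})}$$

$$\mathrm{Sen} = \frac{N_{TP}}{N_{TP} + N_{FN}}$$

$$\mathrm{Spe} = \frac{N_{TN}}{N_{TN} + N_{FP}}$$

$$\mathrm{wPPV} = \frac{w_p N_{TP}}{w_p N_{TP} + w_n N_{FP}}$$

$$\mathrm{wNPV} = \frac{w_n N_{TN}}{w_n N_{TN} + w_p N_{FN}}$$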

where $N_{TP}$, $N_{TN}$, $N_{FP}$ and $N_{FN}$ denote the numbers of TP, TN, FP and FN samples, and $w_p$ and $w_n$ are the weights for the progress and non-progress samples respectively, defined as:

$$w_p = \frac{N_{TP} + N_{TN} + N_{FP} + N_{FN}}{N_{TP} + N_{FN}}$$

$$w_n = \frac{N_{TP} + N_{TN} + N_{FP} + N_{FN}}{N_{TN} + N_{FP}}$$

The wAcc represented the overall performance of progression prediction. The Sen and wPPV represented the pipeline's ability to identify progress samples, while the Spe and wNPV represented its ability to detect non-progress samples.
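The five metrics can be computed directly from the confusion-matrix counts. The sketch below follows the definitions in this section term by term; the function name and the example counts are hypothetical, and it assumes the test set contains at least one progress and one non-progress sample, so that no denominator is zero.

```python
def weighted_metrics(n_tp: int, n_tn: int, n_fp: int, n_fn: int) -> dict:
    """Compute wAcc, Sen, Spe, wPPV and wNPV from confusion-matrix counts."""
    total = n_tp + n_tn + n_fp + n_fn
    w_p = total / (n_tp + n_fn)  # weight of progress samples
    w_n = total / (n_tn + n_fp)  # weight of non-progress samples
    return {
        "wAcc": (w_p * n_tp + w_n * n_tn)
        / (w_p * (n_tp + n_fn) + w_n * (n_tn + n_fp)),
        "Sen": n_tp / (n_tp + n_fn),
        "Spe": n_tn / (n_tn + n_fp),
        "wPPV": w_p * n_tp / (w_p * n_tp + w_n * n_fp),
        "wNPV": w_n * n_tn / (w_n * n_tn + w_p * n_fn),
    }


# Hypothetical counts: 50 progress and 950 non-progress samples.
metrics = weighted_metrics(n_tp=45, n_tn=855, n_fp=95, n_fn=5)
unweighted_ppv = 45 / (45 + 95)
```

With these illustrative counts the unweighted PPV is only about 0.32, because false positives from the large non-progress class outnumber the true positives, while the weighted PPV is 0.9. This is the effect the weights are designed to remove.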

3. Results

The volunteers involved in this study had a mean age of 41.4 years with an SD of 10.5 (Table 1), and the main age group was 40–50 years (42.8%). The mean height was 1.63 m with an SD of 0.09, and the mean weight was 61.1 kg with an SD of 11.4. The mean BMI was 22.9 kg/m² with an SD of 3.47, and most participants had a BMI of 18.5–25 (68.63%). Table 2 presents the pathology label distribution of the dataset used for training and testing the CNN based pipeline. It shows the unbalanced data issue, especially for Schmorl's nodes (4.3% present and 95.7% absent) and MCs (6.7% present and 93.3% absent), for which present samples were far fewer than absent ones.

Table 1.

Demographics of dataset.

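| Characteristic | Category | Proportion |
|---|---|---|
| Age group (years) | 18 to 40 | 36.5% |
| Age group (years) | 40 to 50 | 42.8% |
| Age group (years) | over 50 | 20.7% |
| Gender | Male | 40.2% |
| Gender | Female | 59.8% |
| BMI (kg/m²) | under 18.5 | 7.3% |
| BMI (kg/m²) | 18.5 to 25.0 | 68.6% |
| BMI (kg/m²) | over 25.0 | 24.1% |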

Table 2.

Pathology label distribution.

| Pathology | 1/Present | 0/Absent |
|---|---|---|
| Schmorl's nodes | 4.3% | 95.7% |
| High-intensity zones | 11.7% | 88.3% |
| Modic changes | 6.7% | 93.3% |

The experimental configuration and implementation details for the validation of our new prediction pipeline are described in the Appendix. A total of 864 IVD samples were used for testing. The confusion matrices of the follow-up pathology prediction (Fig. 2) show the numbers of present and absent samples against the CNN prediction results. Further, according to the baseline pathology label, the evaluation metrics for progression prediction, including wAcc, Sen, Spe, wPPV and wNPV, were computed (Table 3). Our CNN based pipeline achieved remarkable performance on HIZs (wAcc: 91.75 ± 2.48%, Sen: 93.07 ± 3.96%, Spe: 90.43 ± 2.51%, wPPV: 90.68 ± 1.19%, wNPV: 92.88 ± 3.87%). For Schmorl's nodes and MCs, despite seriously unbalanced data, our method still achieved high prediction accuracy (the wAcc, Sen, Spe, wPPV and wNPV were 89.46 ± 3.71%, 89.19 ± 2.70%, 89.72 ± 2.42%, 89.67 ± 1.20%, and 89.25 ± 4.95% for Schmorl's nodes, and 87.51 ± 2.23%, 87.93 ± 1.72%, 87.10 ± 1.99%, 87.20 ± 1.55%, and 87.83 ± 1.23% for MCs). The probability density histogram of the CNN model (Fig. 3) shows the distribution of the present probabilities predicted by the CNN model across samples. For all three pathologies, our model generated relatively high probabilities for present samples and low probabilities for absent ones.

Fig. 2.

The confusion matrices of follow-up pathology prediction results for Schmorl's nodes, high-intensity zones, and Modic changes.

Table 3.

Evaluation metrics of progression prediction results.

| Pathology | wAcc (%) | Sen (%) | Spe (%) | wPPV (%) | wNPV (%) |
|---|---|---|---|---|---|
| Schmorl's nodes | 89.46 ± 3.71 | 89.19 ± 2.70 | 89.72 ± 2.42 | 89.67 ± 1.20 | 89.25 ± 4.95 |
| High-intensity zones | 91.75 ± 2.48 | 93.07 ± 3.96 | 90.43 ± 2.51 | 90.68 ± 1.19 | 92.88 ± 3.87 |
| Modic changes | 87.51 ± 2.23 | 87.93 ± 1.72 | 87.10 ± 1.99 | 87.20 ± 1.55 | 87.83 ± 1.23 |

wAcc: weighted accuracy; Sen: sensitivity; Spe: specificity; wPPV: weighted positive predictive value; wNPV: weighted negative predictive value.

Fig. 3.

The probability density histogram of the CNN model output. (a), (b) and (c) represent the Schmorl's nodes, High-intensity zones, and Modic Changes, respectively. The red and blue bars represent the model output of present and absent samples.

4. Discussion

We developed a CNN based pipeline, EDPP-Flow, for the progression prediction of endplate defects, which combines the published MRI-SegFlow with a state-of-the-art CNN model. The soft segmentation results generated by MRI-SegFlow were combined with the original sagittal lumbar MRI and served as the input of the CNN model, which predicts the follow-up pathology status for each IVD. Together with the baseline pathology, this prediction yields the disease progression prediction. Compared with conventional machine learning methods, the CNN model can extract and utilize hierarchical and abstract features, which enables the pipeline to identify complex and unintuitive image patterns associated with pathology progression. The probability density histogram of the CNN model (Fig. 3) demonstrated the ability of our CNN model to discriminate between samples. For most present samples, the CNN model output a high probability, and for most absent samples the outputs were significantly lower. In addition, the trained CNN model is a deterministic mathematical function with no random factors at inference, so our pipeline generates highly consistent predictions: the same input always yields the same prediction. With resampling and data augmentation strategies, the CNN model could be trained with highly unbalanced training data. Our method carries great potential for early detection of multiple LDD pathologies, which has implications for clinical decision-making.

In LDD, the roles of Schmorl's nodes (SN) and HIZs cannot be overstated. There are two proposed pathways of disc degeneration, namely ‘annulus driven degeneration’ and ‘endplate driven degeneration’,27 represented on MRI by HIZs and SN respectively. HIZs are conventionally observed to be manifestations of early disc degeneration,28–31 and are more prevalent in older patients.32,33 SN play a role in the formation of HIZs. Endplate structural defects reduce the pressure in the nucleus of adjacent discs and transfer the stress to the posterior annulus.34 Repetitive loading during bending and compression can cause the nucleus to herniate into the stretched annulus.34,35 Progression will lead to back pain and canal compromise.36–41

Our dataset was established based on a population cohort, and the volunteers were recruited by open invitation, therefore the dataset should naturally reflect the true pathology distribution among the general population. However, the label distribution was highly unbalanced for all pathologies involved in this study (present samples accounted for 4.28%, 11.69%, and 6.71% of the labels for Schmorl's nodes, HIZs and MCs respectively). This may reduce the reliability of the CNN model.42–44 The CNN model is data driven and learns the underlying data structure and distribution from the training data. Because absent samples far outnumber present samples in the training data, the CNN model will tend to predict an unseen sample as absent.45,46 The ability of the CNN model to identify present samples is therefore reduced.

We adopted a resampling and data augmentation strategy in the training process of the CNN model to alleviate the effect of unbalanced data.47,48 The detailed resampling protocol, which equalizes the number of present and absent samples in the training data, is presented in the Appendix. The essential mechanism of resampling is the repetition of present samples, which may increase the risk of overfitting. Because of their repeated appearance, the CNN model may memorize noise or artefacts in the images as meaningful features, which reduces the model's generalizability and prediction accuracy on unseen cases.49,50 We adopted data augmentation to introduce feature variation into the repeated present samples of the training data, which can effectively reduce the risk of overfitting. The detailed implementation of data augmentation is also presented in the Appendix. With resampling and data augmentation, our pipeline showed high robustness against unbalanced data. For all pathologies, our method achieved remarkable performance in identifying both present and absent samples (Fig. 2, Fig. 3).

In this study, we introduced sample weights into the calculation of accuracy (ACC), positive predictive value (PPV) and negative predictive value (NPV), and defined wAcc, wPPV and wNPV for the evaluation of our pipeline, because they more reliably reflect the true performance of the pipeline on the unbalanced testing data. Since non-progress samples account for a very large part of the testing data, they dominate the calculation of ACC, PPV, and NPV. The CNN model can still achieve high ACC even when it performs poorly on the prediction of progress samples. The wAcc, wPPV, and wNPV assign more weight to progress samples to make them as important in the calculation as non-progress samples. Since the calculation of sensitivity or specificity involves only one kind of sample and is therefore not affected by unbalanced data, no weight is introduced in these two metrics.
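A short derivation makes the effect of the weights concrete. The algebra below is our own reading of the definitions in Section 2.4, not a statement made in the original analysis: substituting the weights into wAcc shows that it reduces to the mean of sensitivity and specificity. Here $N$ is shorthand (introduced for this sketch) for the total number of test samples.

```latex
\begin{align*}
N &= N_{TP} + N_{TN} + N_{FP} + N_{FN}, \qquad
w_p = \frac{N}{N_{TP} + N_{FN}}, \qquad
w_n = \frac{N}{N_{TN} + N_{FP}} \\[4pt]
\mathrm{wAcc}
  &= \frac{w_p N_{TP} + w_n N_{TN}}{w_p (N_{TP} + N_{FN}) + w_n (N_{TN} + N_{FP})}
   = \frac{N \cdot \mathrm{Sen} + N \cdot \mathrm{Spe}}{N + N}
   = \frac{\mathrm{Sen} + \mathrm{Spe}}{2}
\end{align*}
```

The values in Table 3 agree with this identity: for HIZs, (93.07 + 90.43)/2 = 91.75, and for Schmorl's nodes, (89.19 + 89.72)/2 ≈ 89.46. As a hypothetical illustration, consider a test set of 1000 samples with 50 progress samples and a model that predicts every sample as non-progress. Its unweighted ACC is 950/1000 = 95%, yet Sen = 0 and Spe = 1, so wAcc = 50%, which exposes the failure on progress samples.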

There are invariably some limitations to report. The CNN model is data-driven and is sensitive to the quality of input data, which may reduce the model's performance when analysing the image with different image quality caused by variations in equipment and MRI protocols. Besides, our CNN model was developed based on one cohort, and its performance on other populations requires further external validation. In future studies, our dataset will be further enriched with different MRI image qualities and from different populations for further development and validation of our CNN model. The AI interpretation work will also be conducted to analyse the underlying mechanism of our CNN model and associate the specific image features with underlying pathologies.

5. Conclusion

We have developed and validated a CNN based pipeline, EDPP-Flow, for the progression prediction of endplate defects, which can predict the disease progression in 5 years based on a baseline lumbar MRI. The resampling and data augmentation strategies are adopted in the training process of the CNN model to alleviate the unbalanced data issue and improve prediction accuracy. A large MRI dataset with follow-up and pathology labels annotated by specialists was developed for the training and testing of our pipeline. The validation result demonstrates that our pipeline has achieved high prediction accuracy for Schmorl's nodes, HIZs, and MCs. Our pipeline shows high robustness against unbalanced label distribution, which maintains high sensitivity to the present samples even when they are rare in the training data. The method has significant potential for clinical implementation. In future work, our pipeline will be validated on other populations, and the interpretation work of CNN model will be conducted to further identify and assess the specific image feature associated with underlying pathologies.

Funding/sponsorship

This study was supported by the Mid-stream Research Programme for Universities (MRP) by the Innovation and Technology Fund (MRP/038/20X).

Informed consent (patient/guardian), mandatory only for case reports/clinical images

Not applicable.

Institutional ethical committee approval

Ethics approval was obtained from the Institutional Review Board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster (UW 14–138).

Authors contribution

Jason Pui Yin CHEUNG: Conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing-original draft, writing-review and editing.

Xihe KUANG: Data curation, formal analysis, investigation, methodology, software, validation, visualization, writing-original draft.

Teng ZHANG: Conceptualization, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing-review and editing.

Kun WANG: Investigation, methodology, writing-review and editing.

Cao YANG: Investigation, methodology, validation, visualization, writing-review and editing.

Role of the funding source

The funding was used to support research staff. The funder was not involved in the study design; the collection, analysis, and interpretation of data; the writing of the report; or the decision to submit the article for publication.

Declaration of competing interest

None.

Acknowledgement

None.

Appendix.

Network Architecture

The Convolutional Neural Network (CNN) model of our pipeline adopts the basic architecture of VGG-M, modified for the assessment of MRI scans, and consists of an encoder and a classifier. The encoder extracts image features from the input data with a sequence of convolutional layers and max-pooling layers. For each IVD, all three IVD regions (from the MRI scan, the soft segmentation of the IVD, and the soft segmentation of the VB) are resized to a standard size of 150 × 200 × 9 and concatenated to form a 150 × 200 × 27 3D volume, which serves as the input of our CNN model. The input volume is first processed by a convolutional layer with a kernel number of 64, kernel size of 7 × 7, and stride of 2 × 2, followed by a 2 × 2 max-pooling layer, which produces the first level feature map with a shape of 36 × 48 × 64. This feature map is further processed by a convolutional layer with a kernel number of 128, kernel size of 5 × 5, and stride of 2 × 2, followed by a 2 × 2 max-pooling layer, which generates the second level feature map with a shape of 8 × 11 × 128. Finally, the second level feature map is processed by three convolutional layers with a kernel number of 256, kernel size of 3 × 3, and stride of 1 × 1, followed by a 2 × 2 max-pooling layer, which produces the final feature map with a shape of 1 × 2 × 256. The extracted image features are then assessed by the classifier of the CNN model, which contains two fully connected layers with 512 hidden units, to generate the probability that the pathology will be present in the input IVD region at follow-up. All convolutional layers and fully connected layers of the CNN model use the Rectified Linear Unit (ReLU) activation, and the output uses the sigmoid function.
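The reported feature map shapes can be checked with simple shape arithmetic. The sketch below is not the authors' implementation: it assumes unpadded ("valid") convolutions and 2 × 2 max-pooling with stride 2, neither of which is stated in the text, and the function names are hypothetical. Under these assumptions the computed shapes reproduce the ones given above.

```python
def window_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def encoder_shapes(height: int = 150, width: int = 200) -> list:
    """Return the (height, width, channels) shape after each encoder block."""
    # (kernel size, stride, number of stacked conv layers, kernel number)
    blocks = [(7, 2, 1, 64), (5, 2, 1, 128), (3, 1, 3, 256)]
    shapes = []
    for kernel, stride, n_conv, channels in blocks:
        for _ in range(n_conv):
            height = window_out(height, kernel, stride)
            width = window_out(width, kernel, stride)
        # Assumed: 2 x 2 max-pooling with stride 2 closes each block.
        height = window_out(height, 2, 2)
        width = window_out(width, 2, 2)
        shapes.append((height, width, channels))
    return shapes


print(encoder_shapes())  # [(36, 48, 64), (8, 11, 128), (1, 2, 256)]
```

The final 1 × 2 × 256 feature map flattens to 512 values, which matches the 512 hidden units of the fully connected layers.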

Training protocol

Resampling and data augmentation are adopted in the training process of the CNN model to alleviate the unbalanced data and overfitting issues. Assume the raw training dataset consists of M present samples and N absent samples (M ≪ N). For each epoch of training, the resampling process first copies the present samples k times, where k is a predefined multiplication factor and k × M < N. Then k × M absent samples are randomly selected from the raw training dataset and combined with the copied present samples to form a resampled dataset (2 × k × M samples in total) for the training of the CNN model. In the resampled dataset, the numbers of present and absent samples are the same. Data augmentation is adopted to introduce feature variation into the copied present samples and so reduce the risk of overfitting. It consists of several random spatial manipulations of the images, such as translation by a random distance, rotation by a random angle, and rescaling by a random ratio. Data augmentation is performed after copying the present samples. For each sample, multiple manipulations may be applied simultaneously, or none at all. The model is first pre-trained on a pathology classification task, in which it is trained to identify the baseline pathology from the baseline image. The pre-trained model is then fine-tuned on the pathology prediction task, in which it is trained to predict the follow-up pathology from the baseline image. The classifier of the model is reinitialized in fine-tuning, and the parameters of the encoder are inherited.
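The per-epoch resampling step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name and signature are hypothetical, absent samples are assumed to be drawn without replacement, the final shuffle is our addition, and `augment` stands in for whatever augmentation routine is used.

```python
import random


def resample_epoch(present, absent, k, augment, rng=None):
    """Build one balanced epoch of (sample, label) pairs, 2 * k * M in total.

    present, absent: lists of samples (M present, N absent, with k * M < N).
    augment: callable (sample, rng) -> sample, applied to each present copy.
    """
    rng = rng or random.Random()
    m = len(present)
    if k * m >= len(absent):
        raise ValueError("k * M must be smaller than N")
    # Copy every present sample k times, augmenting each copy independently.
    copied = [augment(sample, rng) for sample in present for _ in range(k)]
    # Randomly select k * M absent samples (assumed: without replacement).
    selected = rng.sample(absent, k * m)
    epoch = [(s, 1) for s in copied] + [(s, 0) for s in selected]
    rng.shuffle(epoch)  # assumed: mix the two classes before batching
    return epoch
```

Because a new random subset of absent samples is drawn each epoch, the model still sees most absent samples over the course of training, even though each epoch is balanced.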

Implementation

A total of 5760 IVD samples are used for the development and validation of our CNN based pipeline. The IVD samples are divided into a training set and a testing set, with 4896 and 864 samples, respectively. The multiplication factors for Schmorl's nodes, high intensity zones (HIZs), and Modic changes (MCs) are 15, 6, and 10. The data augmentation includes 1) translation by a distance in [−15, +15] pixels, 2) rotation by an angle in [−5, +5] degrees, and 3) rescaling by a ratio in [0.9, 1.1]. For the training process of the CNN model, a mini-batch strategy is adopted with a batch size of 32. Adam is adopted as the optimizer for training, with a learning rate of 0.00001. Binary cross entropy is adopted as the loss function. The threshold values of the model output for determining the pathology prediction are 0.3, 0.6 and 0.5 for Schmorl's nodes, HIZs, and MCs, respectively. The model was implemented with TensorFlow 2.0 on an NVIDIA 2080Ti GPU.
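The augmentation ranges above can be turned into a small parameter sampler. This is a sketch under assumptions that the text does not specify: parameters are drawn uniformly, each manipulation is applied independently with probability `p_apply` (a hypothetical value of 0.5), and translation is sampled separately for the two in-plane axes. The function and key names are illustrative only.

```python
import random


def sample_augmentation(rng: random.Random, p_apply: float = 0.5) -> dict:
    """Draw random augmentation parameters within the reported ranges.

    Each manipulation is switched on independently, so a sample may receive
    several manipulations at once or none at all (identity parameters).
    """
    params = {"dx_px": 0.0, "dy_px": 0.0, "rotate_deg": 0.0, "scale": 1.0}
    if rng.random() < p_apply:  # translation within [-15, +15] pixels
        params["dx_px"] = rng.uniform(-15.0, 15.0)
        params["dy_px"] = rng.uniform(-15.0, 15.0)
    if rng.random() < p_apply:  # rotation within [-5, +5] degrees
        params["rotate_deg"] = rng.uniform(-5.0, 5.0)
    if rng.random() < p_apply:  # rescaling within [0.9, 1.1]
        params["scale"] = rng.uniform(0.9, 1.1)
    return params
```

The returned dictionary would then be passed to an image transformation routine of choice; the transformation itself is left out because the text does not describe its interpolation or boundary handling.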

References

  • 1.Takahashi K., Miyazaki T., Ohnari H., Takino T., Tomita K. Schmorl's nodes and low-back pain. Eur Spine J. 1995;4(1):56–59. doi: 10.1007/BF00298420. [DOI] [PubMed] [Google Scholar]
  • 2.Aprill C., Bogduk N. High-intensity zone: a diagnostic sign of painful lumbar disc on magnetic resonance imaging. Br J Radiol. 1992;65(773):361–369. doi: 10.1259/0007-1285-65-773-361. [DOI] [PubMed] [Google Scholar]
  • 3.Roemer F.W., Guermazi A., Javaid M.K., et al. Change in MRI-detected subchondral bone marrow lesions is associated with cartilage loss: the MOST Study. A longitudinal multicentre study of knee osteoarthritis. Ann Rheum Dis. 2009;68(9):1461–1465. doi: 10.1136/ard.2008.096834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Teraguchi M., Yoshimura N., Hashizume H., et al. The association of combination of disc degeneration, end plate signal change, and Schmorl node with low back pain in a large population study: the Wakayama Spine Study. Spine J. 2015;15(4):622–628. doi: 10.1016/j.spinee.2014.11.012. [DOI] [PubMed] [Google Scholar]
  • 5.Chen J-y, Ding Y., Lv R-y, et al. Correlation between MR imaging and discography with provocative concordant pain in patients with low back pain. Clin J Pain. 2011;27(2):125–130. doi: 10.1097/ajp.0b013e3181fb2203. [DOI] [PubMed] [Google Scholar]
  • 6.Lam K.S., Carlin D., Mulholland R.C. Lumbar disc high-intensity zone: the value and significance of provocative discography in the determination of the discogenic pain source. Eur Spine J. 2000;9(1):36–41. doi: 10.1007/s005860050006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Schellhas K.P., Pollei S.R., Gundry C.R., Heithoff K.B. Lumbar disc high-intensity zone: correlation of magnetic resonance imaging and discography. Spine. 1996;21(1):79–86. doi: 10.1097/00007632-199601010-00018. [DOI] [PubMed] [Google Scholar]
  • 8.Ricketson R., Simmons J.W., Hauser B.O. The prolapsed intervertebral disc: the high-intensity zone with discography correlation. Spine. 1996;21(23):2758–2762. doi: 10.1097/00007632-199612010-00010. [DOI] [PubMed] [Google Scholar]
  • 9.Yu S.W., Haughton V.M., Sether L.A., Wagner M. Comparison of MR and diskography in detecting radial tears of the anulus: a postmortem study. Am J Neuroradiol. 1989;10(5):1077–1081. [PMC free article] [PubMed] [Google Scholar]
  • 10.Teraguchi M., Yim R., Cheung J.P.-Y., Samartzis D. The association of high-intensity zones on MRI and low back pain: a systematic review. Scoliosis and spinal disorders. 2018;13(1):1–8. doi: 10.1186/s13013-018-0168-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Teraguchi M., Cheung J.P.Y., Karppinen J., et al. Lumbar high-intensity zones on MRI: imaging biomarkers for severe, prolonged low back pain and sciatica in a population-based cohort. Spine J. 2020;20(7):1025–1034. doi: 10.1016/j.spinee.2020.02.015. [DOI] [PubMed] [Google Scholar]
  • 12.Teraguchi M., Samartzis D., Hashizume H., et al. Classification of high intensity zones of the lumbar spine and their association with other spinal MRI phenotypes: the Wakayama Spine Study. PLoS One. 2016;11(9) doi: 10.1371/journal.pone.0160111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Zehra U., Cheung J.P.Y., Bow C., Lu W., Samartzis D. Multidimensional vertebral endplate defects are associated with disc degeneration, Modic changes, facet joint abnormalities, and pain. J Orthop Res. 2019;37(5):1080–1089. doi: 10.1002/jor.24195. [DOI] [PubMed] [Google Scholar]
  • 14.Zehra U., Cheung J.P.Y., Bow C., et al. Spinopelvic alignment predicts disc calcification, displacement, and Modic changes: evidence of an evolutionary etiology for clinically‐relevant spinal phenotypes. JOR Spine. 2020;3(1) doi: 10.1002/jsp2.1083. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539. [DOI] [PubMed] [Google Scholar]
  • 16.Jamaludin A., Kadir T., Zisserman A. SpineNet: automated classification and evidence visualization in spinal MRIs. Med Image Anal. 2017;41:63–73. doi: 10.1016/j.media.2017.07.002. [DOI] [PubMed] [Google Scholar]
  • 17.Lootus M., Kadir T., Zisserman A. Automated radiological grading of spinal MRI. Recent Adv Comput Methd Clin Appl Spine Image. 2015:119–130. Springer. [Google Scholar]
  • 18.Cheung J.P.Y., Cheung P.W.H., Cheung A.Y.L., Lui D., Cheung K.M.C. Comparable clinical and radiological outcomes between skipped-level and all-level plating for open-door laminoplasty. Eur Spine J. 2018;27(6):1365–1374. doi: 10.1007/s00586-018-5533-0. [DOI] [PubMed] [Google Scholar]
  • 19.Mader A.O., Lorenz C., Meyer C., editors. A General Framework for Localizing and Locally Segmenting Correlated Objects: A Case Study on Intervertebral Discs in Multi-Modality MR Images. Springer; 2020. [Google Scholar]
  • 20.Rouhier L., Romero F.P., Cohen J.P., Cohen-Adad J. Spine intervertebral disc labeling using a fully convolutional redundant counting model. arXiv preprint arXiv:2003.04387. 2020 [Google Scholar]
  • 21.Gros C., De Leener B., Badji A., et al. Automatic segmentation of the spinal cord and intramedullary multiple sclerosis lesions with convolutional neural networks. Neuroimage. 2019;184:901–915. doi: 10.1016/j.neuroimage.2018.09.081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Li X., Dou Q., Chen H., et al. 3D multi-scale FCN with random modality voxel dropout learning for intervertebral disc localization and segmentation from multi-modality MR images. Med Image Anal. 2018;45:41–54. doi: 10.1016/j.media.2018.01.004. [DOI] [PubMed] [Google Scholar]
  • 23.Perone C.S., Calabrese E., Cohen-Adad J. Spinal cord gray matter segmentation using deep dilated convolutions. Sci Rep. 2018;8(1):1–13. doi: 10.1038/s41598-018-24304-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Samartzis D., Karppinen J., Chan D., Luk K.D.K., Cheung K.M.C. The association of lumbar intervertebral disc degeneration on magnetic resonance imaging with body mass index in overweight and obese adults: a population‐based study. Arthritis Rheum. 2012;64(5):1488–1496. doi: 10.1002/art.33462. [DOI] [PubMed] [Google Scholar]
  • 25.Kuang X., Cheung J.P.Y., Wu H., Dokos S., Zhang T., editors. MRI-SegFlow: A Novel Unsupervised Deep Learning Pipeline Enabling Accurate Vertebral Segmentation of MRI Images. IEEE; 2020. [DOI] [PubMed] [Google Scholar]
  • 26.Chatfield K., Simonyan K., Vedaldi A., Zisserman A. Return of the devil in the details: delving deep into convolutional nets. arXiv preprint arXiv:1405.3531. 2014 [Google Scholar]
  • 27.Adams M.A., Lama P., Zehra U., Dolan P. Why do some intervertebral discs degenerate, when others (in the same spine) do not? Clin Anat. 2015;28(2):195–204. doi: 10.1002/ca.22404. [DOI] [PubMed] [Google Scholar]
  • 28.Adams M.A., Roughley P.J. What is intervertebral disc degeneration, and what causes it? Spine. 2006;31(18):2151–2161. doi: 10.1097/01.brs.0000231761.73859.2c. [DOI] [PubMed] [Google Scholar]
  • 29.Lam K.S., Carlin D., Mulholland R.C. Lumbar disc high-intensity zone: the value and significance of provocative discography in the determination of the discogenic pain source. Eur Spine J. 2000;9(1):36–41. doi: 10.1007/s005860050006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Osti O.L., Vernon-Roberts B., Moore R., Fraser R.D. Annular tears and disc degeneration in the lumbar spine. A post-mortem study of 135 discs. J Bone Joint Surg Br. 1992;74(5):678–682. doi: 10.1302/0301-620X.74B5.1388173. [DOI] [PubMed] [Google Scholar]
  • 31.Park K.W., Song K.S., Chung J.Y., et al. High-intensity zone on L-spine MRI: clinical relevance and association with trauma history. Asian Spine J. 2007;1(1):38–42. doi: 10.4184/asj.2007.1.1.38. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Takatalo J., Karppinen J., Niinimäki J., et al. Association of Modic changes, Schmorl's nodes, spondylolytic defects, high-intensity zone lesions, disc herniations, and radial tears with low back symptom severity among young Finnish adults. Spine (Phila Pa 1976). 2012;37(14):1231–1239. doi: 10.1097/BRS.0b013e3182443855. [DOI] [PubMed] [Google Scholar]
  • 33.Teraguchi M., Samartzis D., Hashizume H., et al. Classification of high intensity zones of the lumbar spine and their association with other spinal MRI phenotypes: the Wakayama Spine Study. PLoS One. 2016;11(9) doi: 10.1371/journal.pone.0160111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Adams M.A., McNally D.S., Dolan P. 'Stress' distributions inside intervertebral discs. The effects of age and degeneration. J Bone Joint Surg Br. 1996;78(6):965–972. doi: 10.1302/0301-620x78b6.1287. [DOI] [PubMed] [Google Scholar]
  • 35.Dolan P., Earley M., Adams M.A. Bending and compressive stresses acting on the lumbar spine during lifting activities. J Biomech. 1994;27(10):1237–1248. doi: 10.1016/0021-9290(94)90277-1. [DOI] [PubMed] [Google Scholar]
  • 36.Lai M.K.L., Cheung P.W.H., Samartzis D., Karppinen J., Cheung K.M.C., Cheung J.P.Y. Clinical implications of lumbar developmental spinal stenosis on back pain, radicular leg pain, and disability. Bone Joint J. 2021;103-B(1):131–140. doi: 10.1302/0301-620X.103B1.BJJ-2020-1186.R2. [DOI] [PubMed] [Google Scholar]
  • 37.Lai M.K.L., Cheung P.W.H., Samartzis D., Cheung J.P.Y. Prevalence and definition of multilevel lumbar developmental spinal stenosis. Global Spine J. 2022;12(6):1084–1090. doi: 10.1177/2192568220975384. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Lai M.K.L., Cheung P.W.H., Song Y.Q., Samartzis D., Cheung J.P.Y. Pedigree analysis of lumbar developmental spinal stenosis: determination of potential inheritance patterns. J Orthop Res. 2021;39(8):1763–1776. doi: 10.1002/jor.24850. [DOI] [PubMed] [Google Scholar]
  • 39.Lai M.K.L., Cheung P.W.H., Cheung J.P.Y. A systematic review of developmental lumbar spinal stenosis. Eur Spine J. 2020;29(9):2173–2187. doi: 10.1007/s00586-020-06524-2. [DOI] [PubMed] [Google Scholar]
  • 40.Teraguchi M., Cheung J.P.Y., Karppinen J., et al. Lumbar high-intensity zones on MRI: imaging biomarkers for severe, prolonged low back pain and sciatica in a population-based cohort. Spine J. 2020;20(7):1025–1034. doi: 10.1016/j.spinee.2020.02.015. [DOI] [PubMed] [Google Scholar]
  • 41.Cheung P.W.H., Fong H.K., Wong C.S., Cheung J.P.Y. The influence of developmental spinal stenosis on the risk of re-operation on an adjacent segment after decompression-only surgery for lumbar spinal stenosis. Bone Joint J. 2019;101-B(2):154–161. doi: 10.1302/0301-620X.101B2.BJJ-2018-1136.R2. [DOI] [PubMed] [Google Scholar]
  • 42.Zhou F., Yang S., Fujita H., Chen D., Wen C. Deep learning fault diagnosis method based on global optimization GAN for unbalanced data. Knowl Base Syst. 2020;187 [Google Scholar]
  • 43.Cieslak D.A., Chawla N.V., editors. Learning Decision Trees for Unbalanced Data. Springer; 2008. [Google Scholar]
  • 44.Su L., Gong M., Zhang P., Zhang M., Liu J., Yang H. Deep learning and mapping based ternary change detection for information unbalanced images. Pattern Recogn. 2017;66:213–228. [Google Scholar]
  • 45.Dong Q., Gong S., Zhu X., editors. Class Rectification Hard Mining for Imbalanced Deep Learning. 2017. [Google Scholar]
  • 46.Dong Q., Gong S., Zhu X. Imbalanced deep learning by minority class incremental rectification. IEEE Trans Pattern Anal Mach Intell. 2018;41(6):1367–1381. doi: 10.1109/TPAMI.2018.2832629. [DOI] [PubMed] [Google Scholar]
  • 47.Ren S., Zhu W., Liao B., et al. Selection-based resampling ensemble algorithm for nonstationary imbalanced stream data learning. Knowl Base Syst. 2019;163:705–722. [Google Scholar]
  • 48.Yu L., Zhou R., Tang L., Chen R. A DBN-based resampling SVM ensemble learning paradigm for credit classification with imbalanced data. Appl Soft Comput. 2018;69:192–202. [Google Scholar]
  • 49.Perez L., Wang J. The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621. 2017 [Google Scholar]
  • 50.Xie Q., Dai Z., Hovy E., Luong T., Le Q. Unsupervised data augmentation for consistency training. Adv Neural Inf Process Syst. 2020;33:6256–6268. [Google Scholar]

Articles from Journal of Orthopaedics are provided here courtesy of Elsevier
