Abstract
Study Design:
Retrospective study.
Objective:
Lumbar magnetic resonance imaging (MRI) findings are believed to be associated with low back pain (LBP). This study sought to develop a new predictive classification system for low back pain.
Method:
Normal subjects with repeated lumbar MRI scans were retrospectively enrolled. A new classification system, based on the radiological features on MRI, was developed using an unsupervised clustering method.
Results:
One hundred and fifty-nine subjects were included. Three distinguishable clusters were identified with unsupervised clustering that were significantly correlated with LBP (P = .017). The incidence of LBP was highest in cluster 3 (57.14%), nearly twice the incidence in cluster 1 (30.11%). There were obvious differences in the sagittal parameters among the 3 clusters. Cluster 3 had the smallest intervertebral height. Based on follow-up findings, 27% of subjects changed clusters. More subjects changed from cluster 1 to clusters 2 or 3 (14.5%) than changed from cluster 2 or cluster 3 to cluster 1 (5%). Participation in sport was more frequent in subjects who changed from cluster 3 to cluster 1.
Conclusion:
Using an unsupervised clustering method, we developed a new classification system comprising 3 clusters, which were significantly correlated with LBP. The prediction of LBP is independent of age and better than that based on individual sagittal parameters derived from MRI. A change in cluster during follow-up may partially predict lumbar degeneration. This study provides a new system for the prediction of LBP that should be useful for its diagnosis and treatment.
Keywords: low back pain, lumbar degeneration, MRI, machine learning, unsupervised clustering
Background
With an aging population, low back pain (LBP) has become a major cause of reduced quality of life and of disability.1,2 Lumbar degeneration is well accepted as a major cause of LBP.3 Magnetic resonance imaging (MRI) is widely used to evaluate lumbar degenerative diseases because it involves no radiation, allows multiplanar imaging, provides excellent spinal soft-tissue contrast, and allows the radiologist to determine the precise location of intervertebral disc changes.4-7 Lumbar degeneration is frequently detected on MRI scans. Although there is no firm evidence for or against a causal relationship between radiological findings and LBP, many studies have suggested that such a relationship exists.8-12
Machine learning can analyze data without supervision and can thereby extract information that is not available from traditional image analysis. We therefore developed an artificial intelligence (AI) strategy to rapidly evaluate the characteristics of lumbar degeneration on MRI. This strategy analyzes the radiological features on MRI by unsupervised machine learning and groups subjects into classes without predefined labels, a process called ‘unsupervised clustering’. This study sought to develop a new predictive classification system for LBP by applying unsupervised clustering to a sample of the general population with repeated lumbar MRI scans.
Methods
Patient Selection
To investigate lumbar degeneration in a relatively normal population over time, subjects who underwent repeated lumbar MRI scans (a routine part of a value-added health package) over a 10-year period at the Health Consultation Department of Zhongshan Hospital (Shanghai, China) and for whom we had contact information were retrospectively enrolled in this study. The interval between the first and last scans was > 3 years in each subject. The study was approved by the Ethical Committee of Zhongshan Hospital affiliated with Fudan University (Shanghai, China) (B2019-220 R), which granted an exemption from the requirement to obtain written informed consent. Data were anonymized before transmission and analysis.
Radiographic and Clinical Assessments
All radiographic assessments were performed with an automatic spine measurement system based on U-Net.13 (For the detailed method, see the Appendix.) The lumbar lordosis angle (LL) was defined as the angle between the superior endplates of L1 and S1. The lumbosacral angle (LS) was calculated as the angle between the horizontal line and the upper endplate of S1. The medial disc height (MDH) was calculated as the distance between the 2 intersections of the medial curve with the inferior and superior endplates of consecutive vertebral bodies. MDH was normalized to the height of the upper vertebra and expressed as L12M/L1M, L23M/L2M, L34M/L3M, L45M/L4M, or L5S1M/L5M (listed as L51M/L5M in Table 1). The mean disc signal intensity and the variance of the disc intensity were computed. The disc signal intensity (DI) was normalized to the signal intensity of the cerebrospinal fluid (CSF) and expressed as DI/CSF%.
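The geometric definitions above can be illustrated with a minimal sketch. It assumes that each endplate is available as two (x, y) landmark points and that heights and intensities have already been measured; the function names and inputs are hypothetical and do not reproduce the U-Net-based measurement system itself.

```python
import math


def angle_between_lines(p1, p2, q1, q2):
    """Return the angle in degrees (0 to 90) between two undirected lines.

    Each line is given by two (x, y) points, e.g. the two ends of an endplate.
    Folding the result into 0..90 is a simplification made for this sketch.
    """
    a1 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
    a2 = math.atan2(q2[1] - q1[1], q2[0] - q1[0])
    diff = abs(math.degrees(a1 - a2)) % 180.0
    return min(diff, 180.0 - diff)


def lumbar_lordosis(l1_endplate, s1_endplate):
    """LL: angle between the superior endplates of L1 and S1."""
    return angle_between_lines(*l1_endplate, *s1_endplate)


def lumbosacral_angle(s1_endplate):
    """LS: angle between the horizontal line and the upper endplate of S1."""
    return angle_between_lines(*s1_endplate, (0.0, 0.0), (1.0, 0.0))


def relative_disc_height(mdh, upper_vertebra_height):
    """MDH normalized to the height of the upper vertebra (e.g. L12M/L1M)."""
    return mdh / upper_vertebra_height


def disc_intensity_percent(disc_signal, csf_signal):
    """DI expressed as a percentage of the CSF signal intensity (DI/CSF%)."""
    return 100.0 * disc_signal / csf_signal
```

Normalizing the disc height and signal intensity in this way makes the values comparable across subjects of different body size and across scans with different intensity scales.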
Subjects were followed up by telephone. Information on the subjects’ demographics, including sex, age (years) at the time of the MRI assessments, body weight (kg), height (cm), smoking status, participation in sport, occupation, sedentary lifestyle, and pain status, was collected using a standardized questionnaire. Body mass index (BMI; kg/m2) was calculated according to the guidelines for Asians proposed by the World Health Organization.14 Smoking status was defined as ‘current smoker’ or not. ‘Drinking’ was defined as the consumption of ≥ 3 alcoholic drinks per day and ‘nondrinking’ as < 3 drinks per day. Participation in sport was defined as regular engagement in any kind of routine exercise, with a minimum frequency of twice per week. Occupation was categorized as sedentary or as a light, medium, heavy, or very heavy workload, according to a scheme for the classification of jobs based on workload. However, because the occupations of only eight subjects were classified as ‘heavy’ or ‘very heavy’, the subjects were regrouped into the categories ‘light’ (subjects with a sedentary job) and ‘not light’ (subjects with a medium, heavy, or very heavy physical workload) for subsequent analyses. A sedentary lifestyle was defined as sitting for > 8 h per day. The presence of LBP was defined as continuous localized pain for ≥ 2 weeks between MRI scans.
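The operational definitions above could be encoded as follows. This is a minimal sketch: the record layout and field names are hypothetical, and only the thresholds stated in the text (≥ 3 drinks per day, exercise at least twice per week, sitting > 8 h per day, pain for ≥ 2 weeks) are taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Questionnaire:
    # Hypothetical field names; the study's questionnaire is not published here.
    weight_kg: float
    height_cm: float
    drinks_per_day: float
    exercise_sessions_per_week: int
    sitting_hours_per_day: float
    longest_pain_episode_days: int


def derive_variables(q: Questionnaire) -> dict:
    """Derive the analysis variables from one questionnaire record."""
    height_m = q.height_cm / 100.0
    return {
        "bmi": round(q.weight_kg / height_m ** 2, 1),       # kg/m2
        "drinking": q.drinks_per_day >= 3,                   # >= 3 drinks per day
        "sport": q.exercise_sessions_per_week >= 2,          # at least twice per week
        "sedentary_lifestyle": q.sitting_hours_per_day > 8,  # sitting > 8 h per day
        "lbp": q.longest_pain_episode_days >= 14,            # pain for >= 2 weeks
    }
```

For example, `derive_variables(Questionnaire(70, 175, 0, 2, 9, 10))` flags the subject as participating in sport and as sedentary, but not as drinking or as having LBP.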
Clustering
Curve clustering was based on MDH and DI. First, rigid registration was used to remove intersubject nuisance variation (e.g., posture, curve length). Based on the registered results, MDH and DI were refitted according to the procedure described above. The MDH and DI features were then clustered by hierarchical clustering with complete linkage. A dendrogram of cluster agglomeration from k = number of subjects down to k = 1 cluster was visualized, and for candidate values of k from 1 to 15, cluster groupings were formed at the similarity level that yielded exactly k clusters. This clustering algorithm is a variant of the traditional k-means clustering algorithm, which integrates a probabilistic seeding initialization method. The number of clusters k was selected on the basis of the validity ratio, which favors a small intracluster distance and a large intercluster distance. In this study, the clustering threshold distance was set at 70% of the maximum linkage distance, which resulted in 3 clusters, as shown in Figure 1.
Figure 1.
Hierarchical clustering result (A), where each column is a sample and each row is a feature. The rows represent the slopes of the curves at each pixel. Bright colors represent a slope in the anterior direction and dark colors a slope in the posterior direction. The three clusters are visualized as curves (B) and in a t-SNE scatter plot (C). Many radiological features are associated with the clusters, as indicated by the Kruskal-Wallis test (D), where the crosses denote significant features.
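The hierarchical step described in the Clustering section (complete linkage, with the dendrogram cut at 70% of the maximum linkage distance) can be sketched in plain Python as shown below. This is an illustrative sketch only: the feature vectors are hypothetical toy points rather than study data, Euclidean distance is an assumption, and the registration, k-means variant, and validity-ratio steps are not covered. In practice, the same cut can be obtained with SciPy's `scipy.cluster.hierarchy.linkage(..., method="complete")` followed by `fcluster(..., criterion="distance")`.

```python
import math


def complete_linkage_merges(points):
    """Agglomerate points bottom-up; return merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(points))}
    merges = []

    def linkage_distance(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        keys = list(clusters)
        d, ka, kb = min(
            (linkage_distance(clusters[ka], clusters[kb]), ka, kb)
            for n, ka in enumerate(keys)
            for kb in keys[n + 1:]
        )
        merges.append((clusters[ka], clusters[kb], d))
        clusters[ka] = clusters[ka] + clusters.pop(kb)
    return merges


def cut_dendrogram(points, fraction=0.7):
    """Assign cluster labels by cutting at `fraction` of the largest merge distance."""
    merges = complete_linkage_merges(points)
    threshold = fraction * max(d for _, _, d in merges)
    labels = list(range(len(points)))
    for members_a, members_b, d in merges:
        if d <= threshold:  # keep only merges below the cut
            for i in members_b:
                labels[i] = labels[members_a[0]]
    remap = {}  # renumber labels as 1..k in order of first appearance
    return [remap.setdefault(label, len(remap) + 1) for label in labels]


# Hypothetical 2-D feature vectors forming three well-separated groups.
toy_features = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 9), (5, 10)]
toy_labels = cut_dendrogram(toy_features, fraction=0.7)
```

Because complete-linkage merge distances never decrease as agglomeration proceeds, replaying only the merges below the threshold is equivalent to cutting the dendrogram at that height.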
Statistical Analysis
Data were analyzed with SPSS software version 20.0 (IBM Corp., Armonk, NY, USA). Descriptive statistics were summarized as frequencies and percentages for categorical variables and as means ± standard deviations (SD) for continuous variables. Analysis of variance (ANOVA) and Pearson’s correlation coefficient were used to compare the differences among clusters. A P value of < .05 was considered statistically significant.
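For readers who wish to check the between-cluster comparisons, a one-way ANOVA F statistic can be approximated from summary statistics alone. The sketch below uses the LL means and SDs from Table 1 together with the cluster sizes reported in the Results (93, 38, and 28). Because these inputs are rounded, the result is only approximate, and it is not the SPSS output of the original analysis.

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) tuples; returns (F, df_between, df_within)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares, recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean, SD) of the lumbar lordosis angle for clusters 1, 2, and 3 (Table 1).
ll_by_cluster = [(93, 44.07, 8.54), (38, 46.95, 11.02), (28, 37.91, 8.12)]
f_stat, df_between, df_within = anova_from_summary(ll_by_cluster)
```

The F statistic can then be compared against the F distribution with the returned degrees of freedom to obtain a P value.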
Results
General Characteristics of the Subjects
One hundred and fifty-nine subjects who underwent at least 2 lumbar MRI scans were included in this study. The interval between the first and last scans was > 3 years (mean = 4.6 years). Overall, 96 (60.4%) subjects were male and 63 (39.6%) were female. The mean age was 45.22 ± 8.70 years, with a range of 24–67 years.
Investigation of Clusters and their Change During Follow-up
Using the method described in the “Clustering” section, 3 distinguishable clusters of patterns were identified. On visual inspection, the curvatures differed among these 3 clusters. The curves of clusters 1 and 2 were C-shaped; cluster 1 had a larger lordosis, whereas cluster 2 was straighter. Cluster 3 presented a reverse S-shaped curve, with kyphosis at the thoracolumbar junction. The apex of its lumbar lordosis was also lower than that of clusters 1 and 2. We then analyzed the curve characteristics of each cluster, including lordosis angle, disc height, and DI. The results indicated that the curve clusters were significantly associated with disc height and DI (Table 1). The intervertebral height was largest in cluster 1 and smallest in cluster 3. The DI was lowest in cluster 2, but was not significantly different between clusters 1 and 3. LL and LS were largest in cluster 2, followed by cluster 1 and cluster 3.
Table 1.
The Distinguishable Features of Clusters.a
Parameter | Cluster 1 | Cluster 2 | Cluster 3 | P value |
---|---|---|---|---|
LL (°) | 44.07 ± 8.54 (42.31, 45.84) | 46.95 ± 11.02 (43.33, 50.58) | 37.91 ± 8.12 (34.76, 41.06) | < .001 |
LS (°) | 38.24 ± 7.04 (36.79, 39.69) | 39.31 ± 9.34 (36.25, 42.39) | 33.78 ± 6.34 (31.32, 36.24) | .009 |
L12M/L1M | 0.53 ± 0.07 (0.51, 0.54) | 0.52 ± 0.06 (0.50, 0.54) | 0.42 ± 0.07 (0.40, 0.45) | < .001 |
L23M/L2M | 0.59 ± 0.07 (0.58, 0.61) | 0.56 ± 0.08 (0.53, 0.58) | 0.46 ± 0.05 (0.44, 0.48) | < .001 |
L34M/L3M | 0.65 ± 0.07 (0.63, 0.66) | 0.60 ± 0.08 (0.57, 0.62) | 0.50 ± 0.06 (0.47, 0.52) | < .001 |
L45M/L4M | 0.67 ± 0.10 (0.65, 0.69) | 0.60 ± 0.08 (0.57, 0.63) | 0.52 ± 0.06 (0.50, 0.55) | < .001 |
L51M/L5M | 0.61 ± 0.13 (0.58, 0.64) | 0.57 ± 0.10 (0.54, 0.60) | 0.50 ± 0.07 (0.47, 0.53) | < .001 |
L12DI/CSF% | 32.33 ± 6.26 (31.04, 33.63) | 23.05 ± 5.30 (21.30, 24.79) | 33.16 ± 6.69 (30.57, 35.76) | < .001 |
L23DI/CSF% | 28.97 ± 5.96 (27.75, 30.20) | 20.21 ± 6.29 (18.15, 22.28) | 30.26 ± 6.53 (27.73, 32.80) | < .001 |
L34DI/CSF% | 26.02 ± 6.69 (24.64, 27.40) | 16.15 ± 5.56 (14.32, 17.98) | 24.21 ± 6.56 (21.66, 26.75) | < .001 |
L45DI/CSF% | 22.00 ± 7.20 (20.52, 23.48) | 12.34 ± 3.77 (11.10, 13.58) | 19.09 ± 7.44 (16.29, 20.83) | < .001 |
L51DI/CSF% | 24.33 ± 10.48 (22.17, 26.49) | 15.98 ± 6.31 (13.91, 18.06) | 16.37 ± 7.71 (13.38, 19.36) | < .001 |
a Data are presented as Mean ± SD (95%CI). P < .05 was accepted as significant.
There were 93 subjects included in cluster 1, with a mean age of 43.9 years; 38 in cluster 2, with a mean age of 51.0 years; and 28 in cluster 3, with a mean age of 41.8 years. At the last follow-up, 78 subjects were included in cluster 1, 34 in cluster 2, and 47 in cluster 3. Forty-three (27%) subjects were moved to a different cluster during the follow-up (Table 2).
Table 2.
The Distribution of Clusters in First and Last Scans.
First scan (rows) / Last scan (columns) | 1 | 2 | 3 | Total |
---|---|---|---|---|
1 | 70 | 10 | 13 | 93 |
2 | 6 | 22 | 10 | 38 |
3 | 2 | 2 | 24 | 28 |
Total | 78 | 34 | 47 | 159 |
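The percentages quoted for cluster changes follow directly from the counts in Table 2. The short sketch below recomputes them, treating the table as a transition matrix from the first scan (rows) to the last scan (columns); the variable names are illustrative only.

```python
# transitions[first_cluster][last_cluster] = number of subjects (Table 2)
transitions = {
    1: {1: 70, 2: 10, 3: 13},
    2: {1: 6, 2: 22, 3: 10},
    3: {1: 2, 2: 2, 3: 24},
}

total = sum(sum(row.values()) for row in transitions.values())
unchanged = sum(transitions[c][c] for c in transitions)  # diagonal cells
changed = total - unchanged

away_from_cluster_1 = transitions[1][2] + transitions[1][3]
back_to_cluster_1 = transitions[2][1] + transitions[3][1]


def percent_of_all(count):
    """Express a count as a percentage of all subjects, to one decimal place."""
    return round(100 * count / total, 1)


print(changed, percent_of_all(changed))                          # 43 27.0
print(away_from_cluster_1, percent_of_all(away_from_cluster_1))  # 23 14.5
print(back_to_cluster_1, percent_of_all(back_to_cluster_1))      # 8 5.0
```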
New Classification Method and Clinical Outcomes
Sixty-two (39%) subjects had LBP. During the follow-up, the incidence of LBP was greatest in cluster 3 (16/28, 57.14%), nearly twice that in cluster 1 (28/93, 30.11%). Cluster 2 had a moderate incidence of LBP (18/38, 47.37%), which was close to that of cluster 3. The incidence of LBP differed significantly between cluster 1 and cluster 3 (P = .017; Table 3).
Table 3.
Clusters and LBP.a
Cluster | Total (N) | LBP (N) | LBP (%) |
---|---|---|---|
1 | 93 | 28 | 30.11% |
2 | 38 | 18 | 47.37% |
3 | 28 | 16 | 57.14% |
ANOVA test | P = .017 |
a P < .05 was accepted as significant.
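As a cross-check on the association between cluster and LBP, the counts in Table 3 can be analyzed with a chi-square test of independence. This is an assumption made for illustration: the table reports an ANOVA test, and the sketch below is an alternative calculation on the same counts rather than a reproduction of the original analysis. For 2 degrees of freedom, the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# cluster -> (subjects with LBP, total subjects), from Table 3
lbp_by_cluster = {1: (28, 93), 2: (18, 38), 3: (16, 28)}


def chi_square_independence(table):
    """Pearson chi-square statistic for a clusters x (LBP, no LBP) table."""
    rows = [(lbp, total - lbp) for lbp, total in table.values()]
    n = sum(sum(row) for row in rows)
    column_totals = [sum(row[j] for row in rows) for j in range(2)]
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * column_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (2 - 1)
    return chi2, dof


chi2, dof = chi_square_independence(lbp_by_cluster)
# Closed-form P value, valid only when dof == 2.
p_value = math.exp(-chi2 / 2) if dof == 2 else None

incidence = {c: round(100 * lbp / total, 2) for c, (lbp, total) in lbp_by_cluster.items()}
```

With these counts the sketch gives a P value of approximately .018, which is close to the reported P = .017, and it reproduces the incidences of 30.11%, 47.37%, and 57.14%.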
Cluster Change During Follow-Up
Among the 43 subjects (27%) whose cluster changed during the follow-up, 23 moved from cluster 1 to clusters 2 or 3, while just 8 subjects moved from clusters 2 or 3 to cluster 1. Two subjects moved from cluster 3 to cluster 1; both subjects exercised regularly. Among the subjects who moved from cluster 1 to cluster 3, only 3 exercised regularly (Table 4). However, the association between cluster change and sport was not significant.
Table 4.
Cluster Change and Participation in Sports.
Cluster change (last − first) | No participation in sport (0) | Participation in sport (1) |
---|---|---|
-2 | 0 | 2 |
-1 | 5 | 2 |
0 | 86 | 25 |
1 | 9 | 8 |
2 | 10 | 3 |
Further univariate analysis examined the correlation between participation in sport and LBP (Pearson’s correlation coefficient r = 0.03) and did not reveal a significant association.
Discussion
This study was based on a relatively normal population of subjects. The prevalence of LBP (39%) in this study was similar to the prevalence in previous studies. 1 The prediction of LBP is difficult and there is currently no classification system for predicting LBP. 15 In this study, using unsupervised clustering, we developed a new classification system with 3 clusters that was significantly correlated with LBP (P = .017). The incidence of LBP was greatest (57.14%) in cluster 3, nearly double that in cluster 1 (30.11%), a statistically significant difference. The subjects in cluster 3 were younger (41.8 years) than those in cluster 1 (43.9 years), which suggests that the difference in LBP was not caused by age.
There were clear differences in the sagittal parameters among the 3 clusters. Cluster 3 contained subjects with the smallest intervertebral height, but not the lowest DI. Previous studies have shown that MRI findings, such as disc herniation and reduced disc height, are associated with LBP.16 We therefore analyzed the relationships of intervertebral height and DI with LBP separately, but found no significant correlation between either individual factor and LBP. This suggests that the classification system predicts LBP better than individual sagittal plane parameters alone.
During the follow-up period, 27% of subjects were moved to a different cluster. A greater proportion of subjects were moved from cluster 1 to clusters 2 or 3 (14.5%) than from clusters 2 or 3 to cluster 1 (5%). Lumbar degeneration is a continuous process, and we assume that this classification system can help to predict lumbar degeneration. Participation in sport was more frequent in those subjects who were moved from cluster 3 to cluster 1, although there was no significant correlation. Although we identified no direct relationship between participation in sport and LBP, other studies have reported that subjects who regularly participated in sport had less LBP. 17 Therefore, participation in sport may reverse or prevent lumbar degeneration and reduce the incidence of LBP. With a larger sample, more predictive factors associated with LBP should be identified.
We believe that this classification system can predict LBP and that this prediction is independent of age and better than that achieved with individual sagittal parameters. This study is an initial attempt to provide a new method of predicting LBP that could be implemented in medical practice. The AI algorithm makes it possible to develop a tool that automatically identifies and classifies lumbar MRI scans in the clinic. With such a tool, doctors and their patients could easily obtain the key sagittal plane parameters and estimate the future risk of LBP. It could also serve as an efficient and stable measurement method for the analysis of medical big data, supporting risk factor analysis in larger populations, the identification of high-risk groups, and targeted health interventions, thereby promoting population health and reducing social and personal costs.
Conclusion
Using unsupervised clustering of data from a relatively normal population, we have developed a new classification system of 3 clusters based on radiological features. This new classification system was significantly correlated with LBP (P = .017). Its prediction of LBP was independent of age and better than that achieved with individual sagittal parameters on MRI. A proportion of subjects (27%) moved among the clusters during follow-up, predominantly from cluster 1 to clusters 2 or 3. This movement among clusters may help explain how the classification system can predict lumbar degeneration. This study provides a new way to predict LBP that could be implemented in medical practice and should be helpful in the diagnosis and treatment of LBP.
Supplemental Material
Supplemental Material, sj-docx-1-gsj-10.1177_21925682211001813 for Predictive Classification System for Low Back Pain Based on Unsupervised Clustering by Lixia Jin, Chang Jiang, Lishu Gu, Mengying Jiang, Yuanlu Shi, Qixun Qu, Na Shen, Weibin Shi, Yuanwu Cao, Zixian Chen, Chun Jiang, Zhenzhou Feng, Linghao Shen and Xiaoxing Jiang in Global Spine Journal
Footnotes
Authors' Note: The development of AI was performed in Shenzhen Digital Life Institute, and other work was performed at Shanghai Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China.
Author Contributions: The first three authors contributed equally to this manuscript as co-first authors. The last three authors contributed equally to this manuscript as co-corresponding authors.
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Each author certifies that neither he or she, nor any member of his or her immediate family, has funding or commercial associations (consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) that might pose a conflict of interest in connection with the submitted article. This study was funded by the National Natural Science Foundation of China (81801375), Youth. The recipient is Yuanwu Cao. This study was also supported by the National Key Research and Development Program of China (No. 2018YFC2000701). The funding sources were not involved in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.
Ethical Approval: This study was approved by the Ethical Committee of Zhongshan Hospital affiliated with Fudan University (B2019-220 R), and an exemption was granted from the requirement to obtain written informed consent.
ORCID iD: Xiaoxing Jiang, MD
https://orcid.org/0000-0002-4884-2683
Supplemental Material: Supplemental material for this article is available online.
References
- 1. Meucci RD, Fassa AG, Faria NM. Prevalence of chronic low back pain: systematic review. Rev Saude Publica. 2015;49:73.
- 2. Vos T, Flaxman AD, Naghavi M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2012;380(9859):2163–2196.
- 3. Brinjikji W, Diehn FE, Jarvik JG, et al. MRI findings of disc degeneration are more prevalent in adults with low back pain than in asymptomatic controls: a systematic review and meta-analysis. AJNR Am J Neuroradiol. 2015;36(12):2394–2399.
- 4. Foster NE, Anema JR, Cherkin D, et al. Prevention and treatment of low back pain: evidence, challenges, and promising directions. Lancet. 2018;391(10137):2368–2383.
- 5. Haughton V. Imaging intervertebral disc degeneration. J Bone Joint Surg Am. 2006;88(2):15–20.
- 6. Suthar P, Patel R, Mehta C. MRI evaluation of lumbar disc degenerative disease. J Clin Diagn Res. 2015;9(4):C4–9.
- 7. Yu L, Wang X, Lin X, et al. The Use of lumbar spine magnetic resonance imaging in eastern china: appropriateness and related factors. Plos One. 2016;11(1):e146369.
- 8. Cheung KM. The relationship between disc degeneration, low back pain, and human pain genetics. Spine J. 2010;10(11):958–960.
- 9. Cheung KM, Karppinen J, Chan D, et al. Prevalence and pattern of lumbar magnetic resonance imaging changes in a population study of one thousand forty-three individuals. Spine (Phila Pa 1976). 2009;34(9):934–940.
- 10. Jensen MC, Brant-Zawadzki MN, Obuchowski N, et al. Magnetic resonance imaging of the lumbar spine in people without back pain. N Engl J Med. 1994;331(2):69–73.
- 11. Luoma K, Vehmas T, Riihimaki H. Disc height and signal intensity of the nucleus pulposus on magnetic resonance imaging as indicators of lumbar disc degeneration. Spine (Phila Pa 1976). 2001;26(6):680–686.
- 12. Maatta JH, Wadge S, MacGregor A, Karppinen J, Williams FM. ISSLS prize winner: vertebral endplate (Modic) change is an independent risk factor for episodes of severe and disabling low back pain. Spine (Phila Pa 1976). 2015;40(15):1187–1193.
- 13. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Joachim H, Nassir N, Alejandro F, William MW, eds. Medical Image Computing and Computer-Assisted Intervention – MICCAI. Springer; 2015.
- 14. Choo V. WHO reassesses appropriate body-mass index for Asian populations. Lancet. 2002;360(9328):235.
- 15. Mukasa D, Sung J. A prediction model of low back pain risk: a population based cohort study in Korea. Korean J Pain. 2020;33(2):153–165.
- 16. Luoma K, Vehmas T, Kerttula L. Chronic low back pain in relation to Modic changes, bony endplate lesions, and disc degeneration in a prospective MRI study. Eur Spine J. 2016;25(9):2873–2881.
- 17. de Campos TF, Maher CG, Fuller JT, et al. Prevention strategies to reduce future impact of low back pain: a systematic review and meta-analysis. Br J Sports Med. 2020.