Abstract
Despite the availability of a range of tests, detecting chronic traumatic osteomyelitis remains challenging in clinical practice. We hypothesized that machine learning based on computed-tomography (CT) images would provide better diagnostic performance for extremity chronic traumatic osteomyelitis than serological biomarkers alone. A retrospective study was carried out to collect medical data from patients with extremity traumatic osteomyelitis according to the criteria of the Musculoskeletal Infection Society. For each patient, serum levels of C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), and D-dimer were measured, and a CT scan of the extremity was performed preoperatively within 7 days after admission. A deep residual network (ResNet) machine learning model was established to recognize bone lesions on the CT images. A total of 28,718 CT images from 163 adult patients were included. We randomly assigned 80% of the CT images from each patient to training, 10% to validation, and 10% to testing. In accuracy, machine learning (83.4%) outperformed CRP (53.2%), ESR (68.8%), and D-dimer (68.1%), each considered separately. Machine learning (88.0%) also showed the highest sensitivity when compared with CRP (50.6%), ESR (73.0%), and D-dimer (51.7%). In specificity, machine learning (77.0%) was better than CRP (59.4%) and ESR (62.2%), but not D-dimer (83.8%). Our findings indicate that machine learning based on CT images is an effective and promising approach for detecting chronic traumatic osteomyelitis in the extremity.
Keywords: computed tomography image, machine learning, osteomyelitis, serum biomarker
1. Introduction
Chronic traumatic osteomyelitis is defined as a long-term or persistent bone infection caused by local injury and characterized by inflammation-associated bone destruction with simultaneous new bone formation.[1] The incidence of this disease has risen to 24.4 per 100,000 in this decade as a complication of open fracture and internal fixation surgery.[2] Chronic traumatic osteomyelitis remains a difficult problem for clinicians because of its long disease course, high risk of recurrence and disability, complex treatment, and increased socioeconomic costs.[3,4] Early detection of chronic traumatic osteomyelitis is key to reducing its substantial burden on patients and society.[5] New methods that improve the detection of bone infection after trauma are highly desirable in clinical practice.[1]
Conventional detection of chronic traumatic osteomyelitis relies mainly on medical history, clinical symptoms, laboratory tests, imaging modalities, and pathological examination.[6] Serum levels of inflammation-related biomarkers play a key supporting role in diagnosing infection.[7] The most widely used serological tests for infectious disease include C-reactive protein (CRP) level, erythrocyte sedimentation rate (ESR), and D-dimer.[8,9] To improve the diagnostic accuracy for chronic traumatic osteomyelitis, imaging of the affected bone is also considered a useful adjunct. A computed-tomography (CT) scan can detect early osseous erosion and document the presence of sequestrum, foreign body, soft tissue swelling, periosteal reaction, and decreased attenuation of the medullary space.[10] CT scanning is inexpensive and widely available in most hospitals.[11] However, the ability to examine large numbers of CT images accurately depends heavily on personal experience, and detection of bone infection is often too subjective.[12]
Recently, machine learning has emerged as a potential tool for finding lesions that are difficult for human experts to detect, because it can rapidly review very large numbers of images.[13,14] Traditional machine learning focused mainly on feature engineering, which involved computing explicit features specified by experts and resulted in algorithms designed to detect disease pathology or specific lesions.[15] Deep learning is a machine learning technique that avoids such engineering by learning the most predictive features directly from the images, given a large dataset of labeled examples.[16] It has potential benefits such as increasing the efficiency, reproducibility, and coverage of screening programs, reducing barriers to access, and improving patient outcomes through early detection and treatment of disease.[13] This technique uses an optimization algorithm called back-propagation to indicate how a machine should change its internal parameters to best predict the desired output for an image.[13] The development of convolutional neural network layers has allowed significant gains in the ability to classify images and detect objects in a picture.[17] Recent studies on deep learning demonstrated a marked improvement in the diagnosis of retinopathy and pneumonia.[18] To the best of our knowledge, no published study has explored machine learning algorithms for the detection of chronic traumatic osteomyelitis in the extremity.
In this study, a deep learning algorithm based on a deep residual network (ResNet) for image classification was established for effective detection of extremity chronic traumatic osteomyelitis. The preoperative diagnostic performance of machine learning based on CT images was also compared with that of serological biomarkers, including CRP, ESR, and D-dimer.
2. Materials and methods
2.1. Data collection
A retrospective study on the detection of extremity chronic traumatic osteomyelitis was conducted in the Second Affiliated Hospital of Zhejiang University School of Medicine with approval of the Ethical Medical Committee. The requirement for written consent from the participants was waived because of the retrospective design of the study. The patients’ personal information was anonymized and de-identified before analysis. The electronic medical records of the hospital were searched using index terms for chronic osteomyelitis between January 1, 2014, and January 1, 2018. The records retrieved initially were reviewed for eligibility. The design of our study is shown in Figure 1.
Figure 1. Flowchart of the study.
Chronic traumatic osteomyelitis, defined as an infection of the bone marrow persisting for more than 10 weeks after injury, was diagnosed on the basis of intra-operative histopathological tests, cultures from at least 2 infection sites yielding the same organism, or a definite sinus tract communicating directly with the tibial bone.[19,20] Records of the enrolled patients contained the following data: gender, age, infected anatomical site, intra-operative microorganism culture outcome, treatment strategies, preoperative serum biomarkers, and CT images after admission. Eligible patients were those with a definite diagnosis of extremity chronic traumatic osteomyelitis and with serum levels of CRP, ESR, and D-dimer and CT scan images obtained preoperatively within 7 days after admission. Records of patients diagnosed with acute osteomyelitis, septic arthritis, chronic osteomyelitis in non-extremity bones (e.g., skull and maxilla), or non-traumatic chronic osteomyelitis were excluded. In addition, patients with comorbidities that might affect serum biomarker levels (e.g., tumor, autoimmune diseases, multiple myeloma, and other systemic inflammatory conditions) were also excluded. If a patient had multiple medical records from multiple hospitalizations, only the most relevant records (chronic traumatic osteomyelitis-associated data) were kept for analysis.
The positive rate of each biomarker was defined as the number of patients whose value exceeded the upper limit of normal for that biomarker divided by the total number of patients with a medical record of that biomarker. Serum ESR was measured using an automatic ESR analyzer (Electa Lab, XC-40B, Forli, Italy). CRP level was measured using an automatic biochemical analyzer (Beckman, AU5421/AU5431, CA, USA). D-dimer level was measured using an automated blood coagulation analyzer (Siemens, SysmexCS5100, Germany). The upper limits of normal were as follows: ESR, 15 mm/h for males and 20 mm/h for females; CRP, 5 mg/L; D-dimer, 500 ng/mL.[7] CT scans of the affected limb were performed for all patients for lesion detection and evaluation in the Second Affiliated Hospital of Zhejiang University School of Medicine. For CT scanning, axial slices in high-resolution mode (0.33 mm thick) were obtained from the proximal to distal tibia using a SOMATOM Definition AS System (Siemens, Germany). The field of view was set to 24 cm with 120 kV (peak) and 200 mA. The image window was set to bone settings (window = 2,000, level = 300). The CT images were exported from the picture archiving and communication system of the hospital and viewed in the ImageJ software. The images around the bone lesion, including some normal parts, were used for further evaluation and labeling. The CT images were labeled as infection or non-infection by three experienced doctors in consensus.
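The bone window quoted above (window = 2,000, level = 300) can be read as a display range in Hounsfield units (HU). The following sketch assumes the usual convention in which the level is the center of the range and the window is its full width, and it maps the result to 8-bit gray values. The function names and the 8-bit scaling are illustrative assumptions and are not taken from the study's pipeline.

```python
def window_bounds(window, level):
    """Return the (lower, upper) HU bounds shown by a display window.

    Assumes `level` is the window center and `window` is the full width.
    """
    half = window / 2.0
    return level - half, level + half


def apply_window(hu_values, window=2000.0, level=300.0):
    """Clip HU values to the window and rescale them to gray levels 0..255."""
    lower, upper = window_bounds(window, level)
    gray = []
    for hu in hu_values:
        clipped = min(max(hu, lower), upper)             # saturate outside the window
        fraction = (clipped - lower) / (upper - lower)   # position in the window, 0..1
        gray.append(int(fraction * 255))                 # truncate to an integer gray level
    return gray


# Bone setting from the text: the display range spans -700 HU to 1300 HU.
bone_bounds = window_bounds(2000.0, 300.0)

# Hypothetical HU samples: air, lower bound, window center, upper bound, metal.
sample_gray = apply_window([-1000.0, -700.0, 300.0, 1300.0, 3000.0])
```

Under this convention, values below -700 HU render as black and values above 1300 HU render as white, which keeps cortical and cancellous bone within the displayed gray range.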
2.2. Image pre-processing
The original CT images contained text at fixed positions (Fig. 2A), which could interfere with subsequent recognition. Photoshop CC (Creative Cloud) was used to select the fixed areas containing text and fill them with black pixels (Fig. 2B).
Figure 2. Processing of CT image. (A) Image before processing. (B) Image after processing.
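The study performed this masking in Photoshop. As an illustration only, the same operation could be scripted. The sketch below blacks out rectangular regions of an image stored as a list of pixel rows. The function name, the region format, and the toy coordinates are hypothetical and do not come from the study.

```python
def mask_regions(image, regions, fill=0):
    """Return a copy of a 2-D image (list of rows) with each rectangle set to `fill`.

    Each region is a (top, left, height, width) tuple in pixel coordinates.
    Parts of a rectangle that fall outside the image are ignored.
    """
    masked = [row[:] for row in image]           # copy so the input stays intact
    n_rows = len(masked)
    n_cols = len(masked[0]) if masked else 0
    for top, left, height, width in regions:
        for r in range(max(top, 0), min(top + height, n_rows)):
            for c in range(max(left, 0), min(left + width, n_cols)):
                masked[r][c] = fill
    return masked


# Hypothetical 4x6 image with a text overlay assumed in the top-left corner.
toy_image = [[9] * 6 for _ in range(4)]
toy_text_regions = [(0, 0, 2, 3)]                # 2 rows high, 3 columns wide
toy_masked = mask_regions(toy_image, toy_text_regions)
```

Because the overlays sit at fixed positions, one list of rectangles could in principle be reused for every image from the same export, which is the property the manual procedure also relies on.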
2.3. Deep learning algorithm development
The CT images were loaded onto a computer with the Keras deep learning framework, with CUDA 9.0/cuDNN 7.0 dependencies for graphics processing unit acceleration. A residual network was used to classify the CT images.[21] As layers are added to a deep neural network, training is likely to run into the degradation problem. This can be understood as follows: for different inputs, only a small number of hidden units in each layer change their activation values, while the rest remain unchanged. As a result, the accuracy first rises and then falls rapidly. The residual learning framework addresses the degradation problem. The building block of the residual network is shown in Figure 3.
Figure 3. The building block of residual network.
Let H(x) denote the desired underlying mapping. The stacked nonlinear layers were fitted to the residual mapping F(x) = H(x) − x, so that the original mapping was recast as F(x) + x. The formulation F(x) + x was realized by feedforward neural networks with “shortcut connections.” The shortcut connections performed identity mapping, and their outputs were added to the outputs of the stacked layers. After the shortcut connections were added, the network was trained with stochastic gradient descent. Previous work has shown that residual networks gain classification accuracy as network depth increases.
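To make the formulation concrete, the following is a minimal sketch of one residual block in plain Python. It assumes small fully connected layers in place of the convolutional layers of an actual ResNet and applies a ReLU after the addition, as in the building block described by He et al.[21] The helper names and the weights are illustrative only and are not the model trained in this study.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, value) for value in vector]


def linear(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F(x) = w2 * relu(w1 * x).

    The two weight matrices stand in for the stacked nonlinear layers.
    The shortcut connection adds the unchanged input x to their output.
    """
    f_x = linear(w2, relu(linear(w1, x)))              # residual mapping F(x)
    return relu([f + xi for f, xi in zip(f_x, x)])     # identity shortcut, then ReLU


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# With zero weights F(x) = 0, so the block passes a non-negative input through unchanged.
passthrough = residual_block(x, zeros, zeros)

# With identity weights F(x) = x for non-negative x, so the block outputs 2x.
doubled = residual_block(x, identity, identity)
```

The zero-weight case illustrates why the residual form helps with degradation: an added block only has to drive F(x) toward zero to behave as an identity mapping, rather than learn the identity from scratch.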
2.4. Evaluating the algorithm
The original dataset comprised 28,718 CT images from 163 different patients, including 7,668 infection images and 21,050 non-infection images. The CT images of each patient were relatively independent. We randomly assigned 80% of the CT images from each patient to training, 10% to validation, and 10% to testing. We artificially enlarged the training set by generating image translations and horizontal reflections. The training set was then used to train the algorithm, and the validation set was used to adjust the hyper-parameters. The model determined whether a CT image belonged to the infection or non-infection class. There was no overlap among the three sets. On the testing set, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), and negative likelihood ratio (NLR) were used to evaluate the algorithm.
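The per-patient 80/10/10 split and the two augmentations named above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names, the fixed random seed, the rounding of subset sizes, and the toy image identifiers are all hypothetical, and images are represented as lists of pixel rows.

```python
import random


def split_patient_images(image_ids, rng, train_frac=0.8, val_frac=0.1):
    """Shuffle one patient's images and cut them into train, validation, and test."""
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])          # remainder goes to testing


def split_dataset(images_by_patient, seed=0):
    """Apply the per-patient split to every patient and pool the three subsets."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for patient_id in sorted(images_by_patient):
        tr, va, te = split_patient_images(images_by_patient[patient_id], rng)
        train.extend(tr)
        val.extend(va)
        test.extend(te)
    return train, val, test


def horizontal_flip(image):
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def translate_horizontal(image, shift, fill=0):
    """Shift an image right by `shift` pixels (left if negative), padding with `fill`."""
    shifted = []
    for row in image:
        pad = min(abs(shift), len(row))
        if shift >= 0:
            shifted.append([fill] * pad + row[:len(row) - pad])
        else:
            shifted.append(row[pad:] + [fill] * pad)
    return shifted


# Hypothetical toy dataset: two patients with 10 and 20 image identifiers.
toy_images = {
    "patient_1": ["p1_img_%d" % i for i in range(10)],
    "patient_2": ["p2_img_%d" % i for i in range(20)],
}
toy_train, toy_val, toy_test = split_dataset(toy_images)
```

Because the split is drawn within each patient, every patient contributes images to all three subsets, while no individual image appears in more than one subset.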
2.5. Statistical analysis
Descriptive statistical analysis was conducted for ESR, D-dimer, and CRP. Statistical analysis was performed using the SPSS 19.0 software (SPSS Inc, Chicago, IL). Continuous variables were reported as the mean ± standard deviation with the interquartile range (IQR), depending on the data distribution. Dichotomous variables were expressed as percentages. The chi-squared test was used to compare rates among different groups, with corresponding odds ratios (ORs) and 95% confidence intervals (CIs). A P value of <.05 was considered statistically significant.
3. Results
A total of 163 cases were enrolled in our study, including 114 males and 49 females, with a median age of 50 years. The culture results of these patients with chronic traumatic osteomyelitis are listed in Table 1. For extremity chronic traumatic osteomyelitis, the rate of positive cultures in the cohort was 45.40% (74/163) and the rate of negative cultures was 54.60% (89/163). The mean serum CRP was 12.82 mg/L (IQR, 2.15 to 11.60 mg/L). The mean serum ESR was 23.75 mm/h (IQR, 6.00 to 30.50 mm/h). The mean serum D-dimer level was 787.05 ng/mL (IQR, 305 to 785 ng/mL).
Table 1. Culture results of patients for chronic extremity osteomyelitis.

CRP showed a sensitivity and specificity of 50.6% (95% CI 39.8% to 61.2%) and 59.4% (95% CI 47.4% to 70.5%), respectively. For ESR, the sensitivity and specificity were 73.0% (95% CI 62.4% to 81.6%) and 62.2% (95% CI 50.1% to 73.0%). D-dimer demonstrated a sensitivity and specificity of 51.7% (95% CI 46.6% to 62.3%) and 83.8% (95% CI 73.0% to 91.0%). For CT image recognition based on machine learning, 2884 CT images from 163 patients were used for testing, with 10% of the images taken from each patient for recognition. The sensitivity of image recognition was 70%. At the image level, the recognition accuracy was 93%, the specificity was 94%, and the sensitivity was 90%. At the patient level, a patient with 1 or more CT images identified as osteomyelitis was classified as an osteomyelitis patient, which is consistent with clinical practice. With this CT image recognition standard, the sensitivity and specificity were 88.0% (95% CI 79.9% to 94.2%) and 77.0% (95% CI 65.5% to 85.7%). Overall, machine learning showed the highest accuracy (83.4%) when compared with CRP (53.2%), ESR (68.8%), and D-dimer (68.1%). The detailed results for the performance of serological biomarkers and deep learning for diagnosis of extremity chronic traumatic osteomyelitis are shown in Table 2.
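The patient-level rule described above (a patient is positive when at least one image is flagged) and the evaluation measures listed in Section 2.4 can be sketched as follows. The function names and the toy labels are hypothetical, the counts are not the study's data, and the sketch assumes that no denominator is zero.

```python
def patient_is_positive(image_predictions):
    """Call a patient positive when 1 or more images are predicted as infection."""
    return any(image_predictions)


def confusion_counts(patients):
    """Count TP, FP, TN, FN from {patient_id: (truth, image_predictions)}."""
    tp = fp = tn = fn = 0
    for truth, image_predictions in patients.values():
        predicted = patient_is_positive(image_predictions)
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute the standard diagnostic measures from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                       # positive predictive value
        "npv": tn / (tn + fn),                       # negative predictive value
        "plr": sensitivity / (1 - specificity),      # positive likelihood ratio
        "nlr": (1 - sensitivity) / specificity,      # negative likelihood ratio
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical toy cohort: (true infection status, per-image predictions).
toy_patients = {
    "a": (True, [0, 0, 1]),    # one flagged image, so a true positive
    "b": (True, [0, 0, 0]),    # no flagged image, so a false negative
    "c": (False, [0, 1, 0]),   # one flagged image, so a false positive
    "d": (False, [0, 0, 0]),   # no flagged image, so a true negative
}
toy_counts = confusion_counts(toy_patients)
toy_metrics = diagnostic_metrics(*toy_counts)
```

The any-image rule favors sensitivity at the patient level, since a single false-positive image is enough to label an uninfected patient as positive.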
Table 2. Performance of serum tests and machine learning for diagnosing chronic extremity osteomyelitis with 95% confidence interval.

4. Discussion
The detection of extremity chronic traumatic osteomyelitis is still time-consuming and troublesome for clinicians.[1] The ever-increasing use of digital imaging provides opportunities to establish rich and deep datasets for objective and precise analysis of disease lesions.[13] In the present study, a ResNet-based deep learning model using CT images was established for recognizing extremity chronic traumatic osteomyelitis. Our findings demonstrated that machine learning based on CT images could provide more accurate and efficient detection of extremity chronic traumatic osteomyelitis than the serum biomarkers CRP, ESR, and D-dimer considered separately.
In clinical practice, CT scanning is widely performed for chronic osteomyelitis because it is cheaper and more convenient than other imaging methods.[12] However, a previous study showed that CT evaluation has a sensitivity of only 67% and a specificity of 50% in chronic osteomyelitis.[22] The performance of CT examination for disease diagnosis is limited by large variations in pathology, reduced soft tissue contrast, and potential fatigue of human experts.[10,13] With the development of machine learning, computer-assisted diagnosis based on images is more efficient and objective than human assessment for certain diseases.[18] A deep learning neural network can recognize key areas of interest on CT images that correspond to the pathological lesion.[23] By discovering morphological and textural patterns in images from data, researchers were able to achieve 90.3% sensitivity and 98.1% specificity for detecting referable diabetic retinopathy on a validation set.[24] The main challenge in applying deep learning to recognize chronic osteomyelitis arises from the limited number of training samples available to build deep models without overfitting.[14] In this study, we generated artificial samples via affine transformation and then trained the network from scratch with the expanded dataset. In addition, because the structure among neighboring pixels or voxels is an important source of information, we designed a convolutional neural network that uses the spatial configuration effectively by taking the images themselves as input.[25] With these procedures, the deep learning algorithm raised the diagnostic accuracy of CT evaluation for chronic osteomyelitis to 83.4%.
In the present study, we compared machine learning based on CT images with serum biomarkers. Serological biomarkers are still tested for the detection of chronic osteomyelitis because of their adjunctive value. However, their diagnostic accuracy is limited by complex factors including the nature of the disease, the severity of infection, the pathogens, and the course of the bone infection. According to a previous study, CRP and ESR had positive rates of 54.65% and 65.57%, respectively, which was consistent with our findings.[7] Machine learning showed better accuracy and sensitivity than any single serum biomarker in extremity chronic traumatic osteomyelitis. Interestingly, our results indicated that D-dimer had a higher specificity (83.8%) than machine learning (77.0%). Recently, a prospective study showed that D-dimer might be a potential indicator for the diagnosis of periprosthetic joint infection, with a sensitivity of 89% and a specificity of 93%.[9] We speculate that the higher sensitivity in periprosthetic joint infection arises because the infection is located in the joint, where the inflamed synovium secretes a large amount of fibrin that is subsequently degraded, increasing the serum concentration of D-dimer.[26] In chronic extremity osteomyelitis, by contrast, the lesion is mainly located in the tibial bone without synovium, and its characteristics are quite different from those of periprosthetic joint infection.[8,27,28] Nevertheless, the high specificity of D-dimer indicates its potential value for follow-up evaluation of chronic traumatic osteomyelitis after treatment.
Our study has several limitations. First, we included only images from patients who met our study criteria, and the neural network was trained on these images. With additional datasets from other institutions, a deep learning model might learn more generalizable features in medical images, allowing further improvement in performance. Second, we have not applied this technology to samples or images from unseen patients, which would be more useful and more realistic for its application in hospitals. Third, because of the black-box nature of deep models, it remains challenging to understand and interpret the learned model intuitively when investigating the underlying patterns in CT images.[14] Fourth, we used only 3 serological biomarkers for the assessment of chronic osteomyelitis because of the retrospective and observational design of the study.
5. Conclusion
In this study, a deep residual network (ResNet) algorithm based on CT images showed better diagnostic performance than the serological biomarkers CRP, ESR, and D-dimer for extremity chronic traumatic osteomyelitis. In the future, a multi-category classification system combining biological indicators and CT image-based machine learning might offer greater practical value and generalizability for chronic osteomyelitis.
Acknowledgments
The authors would like to acknowledge the staff of orthopedic lab of Zhejiang University for their assistance.
Author contributions
Gang Feng: data collection and analysis.
Guangyao Jiang: collection of data, data analysis and interpretation.
Jianqiao Hong: data analysis and interpretation.
Ruijian Yan: conception and design, financial support, final approval of manuscript.
Shenghong Mou: image processing, data analysis.
Shiming Chen: collection and assembly of data.
Sihao Li: collection of data, data analysis and interpretation.
Weijie Lin: model establishment, image processing, data analysis.
Xin Lu: model establishment, image processing, data analysis.
Yifan Wu: concept and design, collection and assembly of data, data analysis and interpretation, manuscript writing, final approval of manuscript.
Zhiyuan Cheng: conception and design, data analysis and interpretation, revision of the manuscript, financial support.
Ruijian Yan orcid: 0000-0003-1572-6675.
Footnotes
Abbreviations: CIs = confidence intervals, CRP = C-reactive protein, CT = computed-tomography, ESR = erythrocyte sedimentation rate, IQR = interquartile range, NLR = negative likelihood ratio, NPV = negative predictive value, ORs = odds ratios, PLR = positive likelihood ratio, PPV = positive predictive value, ResNet = deep residual network.
How to cite this article: Wu Y, Lu X, Hong J, Lin W, Chen S, Mou S, Feng G, Yan R, Cheng Z. Detection of extremity chronic traumatic osteomyelitis by machine learning based on computed-tomography images: a retrospective study. Medicine. 2020;99:9(e19239).
YW and XL contributed equally to this work.
This work was supported by the National Key R&D Program of China (2018YFA0701401), the Joint Construction Project of Zhejiang Province and Health Ministry (WKJ-ZJ-2029), and the Foundation of Zhejiang Educational Committee (Y201839065, Y201941414).
The authors have no financial conflicts of interest to disclose.
References
- [1] Hatzenbuehler J, Pulling TJ. Diagnosis and management of osteomyelitis. Am Fam Physician 2011;84:1027–33.
- [2] Kremers HM, Nwojo ME, Ransom JE, et al. Trends in the epidemiology of osteomyelitis: a population-based study, 1969 to 2009. J Bone Joint Surg Am 2015;97:837–45.
- [3] Thakore RV, Greenberg SE, Shi H, et al. Surgical site infection in orthopedic trauma: a case-control study evaluating risk factors and cost. J Clin Orthop Trauma 2015;6:220–6.
- [4] Mouzopoulos G, Kanakaris NK, Kontakis G, et al. Management of bone infections in adults: the surgeon's and microbiologist's perspectives. Injury 2011;42 Suppl 5:S18–23.
- [5] Lima AL, Oliveira PR, Carvalho VC, et al. Recommendations for the treatment of osteomyelitis. Braz J Infect Dis 2014;18:526–34.
- [6] Lew DP, Waldvogel FA. Osteomyelitis. Lancet 2004;364:369–79.
- [7] Jiang N, Qin CH, Hou YL, et al. Serum TNF-alpha, erythrocyte sedimentation rate and IL-6 are more valuable biomarkers for assisted diagnosis of extremity chronic osteomyelitis. Biomark Med 2017;11:597–605.
- [8] Eid AJ, Berbari EF. Osteomyelitis: review of pathophysiology, diagnostic modalities and therapeutic options. J Med Liban 2012;60:51–60.
- [9] Shahi A, Kheir MM, Tarabichi M, et al. Serum D-dimer test is promising for the diagnosis of periprosthetic joint infection and timing of reimplantation. J Bone Joint Surg Am 2017;99:1419–27.
- [10] Gold RH, Hawkins RA, Katz RD. Bacterial osteomyelitis: findings on plain radiography, CT, MR, and scintigraphy. Am J Roentgenol 1991;157:365–70.
- [11] Pineda C, Vargas A, Rodriguez AV. Imaging of osteomyelitis: current concepts. Infect Dis Clin North Am 2006;20:789–825.
- [12] Termaat MF, Raijmakers PG, Scholten HJ, et al. The accuracy of diagnostic imaging for the assessment of chronic osteomyelitis: a systematic review and meta-analysis. J Bone Joint Surg Am 2005;87:2464–71.
- [13] Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng 2017;19:221–48.
- [14] McBee MP, Awan OA, Colucci AT, et al. Deep learning in radiology. Acad Radiol 2018;25:1472–80.
- [15] Sargent DJ. Comparison of artificial neural networks with other statistical approaches: results from medical data sets. Cancer 2001;91 Suppl 8:1636–42.
- [16] Suzuki K. Overview of deep learning in medical imaging. Radiol Phys Technol 2017;10:257–73.
- [17] Guo YM, Liu Y, Georgiou T, et al. A review of semantic segmentation using deep neural networks. Int J Multimed Inf Retr 2018;7:87–93.
- [18] Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018;172:1122–31.
- [19] Zimmerli W. Clinical presentation and treatment of orthopaedic implant-associated infection. J Intern Med 2014;276:111–9.
- [20] Metsemakers WJ, Kuehl R, Moriarty TF, et al. Infection after fracture fixation: current surgical and microbiological concepts. Injury 2018;49:511–22.
- [21] He KM, Zhang XY, Ren SQ, et al. Deep residual learning for image recognition. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016:770–8.
- [22] Whalen JL, Brown ML, McLeod R, et al. Limitations of indium leukocyte imaging for the diagnosis of spine infections. Spine (Phila Pa 1976) 1991;16:193–7.
- [23] van Tulder G, de Bruijne M. Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted Boltzmann machines. IEEE Trans Med Imaging 2016;35:1262–72.
- [24] Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016;316:2402–10.
- [25] Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition. P IEEE 1998;86:2278–324.
- [26] Busso N, Hamilton JA. Extravascular coagulation and the plasminogen activator/plasmin system in rheumatoid arthritis. Arthritis Rheum 2002;46:2268–79.
- [27] Elgeidi A, Elganainy AE, Abou EN, et al. Interleukin-6 and other inflammatory markers in diagnosis of periprosthetic joint infection. Int Orthop 2014;38:2591–5.
- [28] Michail M, Jude E, Liaskos C, et al. The performance of serum inflammatory markers for the diagnosis and follow-up of patients with osteomyelitis. Int J Low Extrem Wounds 2013;12:94–9.
