Abstract
Objective:
To identify the feasibility and efficiency of deep convolutional neural networks (DCNNs) in the detection of ankle fractures and to explore ensemble strategies that applied multiple projections of radiographs.
Ankle radiographs (AXRs) are the primary tool used to diagnose ankle fractures. Applying DCNN algorithms on AXRs can potentially improve the diagnostic accuracy and efficiency of detecting ankle fractures.
Methods:
A DCNN was trained using a trauma image registry, including 3102 AXRs. We separately trained the DCNN on anteroposterior (AP) and lateral (Lat) AXRs. Different ensemble methods, such as “sum-up,” “severance-OR,” and “severance-Both,” were evaluated to incorporate the results of the model using different projections of view.
Results:
The AP/Lat model’s individual sensitivity, specificity, positive-predictive value, accuracy, and F1 score were 76%/84%, 90%/86%, 88%/86%, 83%/85%, and 0.816/0.850, respectively. Furthermore, the area under the receiver operating characteristic curve (AUROC) of the AP/Lat model was 0.890/0.894 (95% CI: 0.826–0.954/0.831–0.956). The sum-up method generated balanced results by applying both models and obtained an AUROC of 0.917 (95% CI: 0.863–0.972) with 87% accuracy. The severance-OR method resulted in a better sensitivity of 90%, and the severance-Both method obtained a high specificity of 94%.
Conclusion:
Ankle fractures in AXRs could be identified by the trained DCNN algorithm. The choice of ensemble method can depend on the clinical situation, which might help clinicians detect ankle fractures efficiently without interrupting the current clinical pathway.
Advances in knowledge:
This study demonstrated different ensemble strategies of AI algorithms on multiple view AXRs to optimize the performance in various clinical needs.
Background
Ankle joints are complex joints that handle the stabilization and motion of lower extremities. Ankle joints include multiple bones and several ligamentous attachments, muscular attachments, and a fibrous capsule to maintain articulation. The numbers of ankle fractures and the resulting post-surgical outcomes are significant public health concerns worldwide. 1–3 The incidence of ankle fractures is approximately 179–187 fractures per 100,000 people annually. 4,5 The incidence has increased significantly in many industrialized countries and is most likely due to the growth in the number of people involved in athletics and the size of the elderly population. 6–8
The vast majority of ankle fractures are malleolar fractures: 60–70% occur as unimalleolar fractures, 15–20% occur as bimalleolar fractures, and 7–12% occur as trimalleolar fractures. 4,6 The complicated fracture patterns and presentations might influence the detection of fractures at an initial examination. Early diagnosis and management might preserve the movement and stabilization function of the joint and the ambulation and quality of life of the patient. 9 Ankle radiographs (AXRs), including the different projections of the anteroposterior (AP) view and lateral (Lat) view, are essential and widely used to diagnose ankle fractures. 10 However, the diagnostic accuracy of ankle fractures by AXRs is not optimal. There are numerous small structures in close proximity to complex, nonlinear courses, such as the lateral ankle ligaments and intrinsic musculature, which can be difficult to distinguish without a thorough understanding of anatomy. Many of these structures are difficult to see without proper patient positioning, optimization of the scanning plane to image the desired anatomy, or adjusting image acquisition parameters to obtain the highest possible resolution. Previous studies showed that the rate of missed diagnosis, even by experts, was still as high as 5.6–42%, 11,12 and delayed diagnosis and management worsened the prognosis. To prevent further health sequelae and the medical cost of delayed diagnosis, additional radiographs, bone scans, CT, or MRI have been advised as diagnostic alternatives. 9 However, it is not efficient or economical to use these diagnostic tools for routine examination. Furthermore, these diagnostic modalities are not always available in all hospitals.
Digital medical image systems offer immediate and remote access 13 and increase the possibility of computer-aided diagnostics. 14 Computerized analysis based on deep learning has shown potential benefits as a diagnostic strategy and has become feasible in recent years. 15 Deep convolutional neural network (DCNN) learning has proven its ability to screen for several medical image lesions with adequate performance, 16–18 especially with supervised learning. 19 The advantages of time savings and increased diagnostic rates have been proven by numerous studies. The application and achievements of DCNNs in the medical field are expected to increase over the years. Some studies have presented that there is still an excellent opportunity to apply DCNNs in skeletal trauma, which is well-documented for hip, pelvis, knee, and wrist trauma. 17,20–22 Human-based AI models also have the ability to increase diagnostic accuracy in the emergency department. 23 However, an excellent algorithm to detect ankle fractures has not been thoroughly evaluated. Automated detection of ankle fractures has potential benefits, such as increased efficiency, decreased misdiagnosis rates, reduced possibilities of delayed management, and improved patient outcomes. 16–18 Most fracture detection algorithms focus on the accuracy of a single view of the target part. However, multiple projections of views are required to diagnose and evaluate extremity bone fractures in clinical practice. Integrating information from different views is critical and may provide better insight to evaluate trauma patients.
In this study, we trained a DCNN algorithm on an AXR data set and investigated its performance in detecting ankle fractures. In addition, we explored ensemble deep-learning methods to evaluate ankle fractures with bidirectional projection radiographic images.
Methods and materials
Study population
We acquired the Chang Gung Trauma Registry Program (CGTRP) from the Chang Gung Memorial Hospital (CGMH), Linkou, Taiwan. We extracted the data and images of all trauma patients from Jan 2012 to December 2016 from CGMH, which is a level I trauma center. Demographic, medical, perioperative, hospital course, medical image, and follow-up data and information regarding complications were recorded prospectively into a computerized database. The Internal Review Board of CGMH approved the study. The IRB number is 202002343B0.
AXR data set
Patient data in the trauma registry from August 2012 to Dec 2016 were evaluated and selected if the patient entered the emergency room with trauma and received plain ankle films (AXRs) on the date of injury. The AXR images included pairs of AP views and Lat views, which were stored automatically by a Python script at their original sizes in JPEG format. Each was given a serial number and deidentified in both the images and registry. The patient data without paired images were stored separately. After the images were stored, all images were checked and deidentified. All labels in each image were carefully examined and removed.
Image labeling and pre-processing
After the AXR data sets were established, the images were initially labeled as presenting an ankle fracture or no ankle fracture according to the diagnosis in the trauma registry. A surgeon reviewed each image to verify the accuracy of the label. The radiologist report, diagnosis, clinical course, and other related images, such as CT or other views of the ankle joint, were reviewed if the label was questionable. Images with poor quality and images showing other fractures, including tibial shaft fractures and calcaneus fractures, were excluded.
Data set splitting
The sample size of the test set was calculated using a power calculation to estimate the number of participants needed to detect a 15% difference in proportions between the two groups, with a desired power of 0.8, a sensitivity of 0.85 for the study, and a significance level of 0.05. The test set was set aside before the whole development procedure. A total of 100 patient images, 50 with ankle fractures and 50 without ankle fractures, were identified. The remaining images were used for model development. Another 100 balanced patient images were used as the validation set, and the remaining images were used as the training set. The demographic distribution of each data set is shown in Table 1.
Table 1.
Demographic characteristics of ankle data set
| | Training set (n = 1351) | Validation set (n = 100) | Test set (n = 100) | p-value |
|---|---|---|---|---|
| Age (mean, SD) | 37.32 (18.09) | 37.60 (18.72) | 36.25 (17.34) | 0.834 |
| Gender | | | | 0.075 |
| Male (n, %) | 776 (57.4) | 68 (68.0) | 63 (63.0) | |
| Female (n, %) | 575 (42.6) | 32 (32.0) | 37 (37.0) | |
| Ankle fracture | | | | 0.001 |
| Yes (n, %) | 488 (36.1) | 50 (50.0) | 50 (50.0) | |
| No (n, %) | 863 (63.9) | 50 (50.0) | 50 (50.0) | |
SD, Standard Deviation.
Development of the algorithm
DCNNs are machine learning algorithms developed from artificial neural networks and are widely used in computer vision and medical image recognition. The basic concept is to use pixel values from a digital image as input, apply techniques such as convolution and pooling in each layer, and adjust the weights in the neural network according to the difference between the output and the ground-truth labels. After a significant number of images are input as the training material, the weights in the neural network are adjusted to fit the problem. We used Xception as our neural network structure. 24 The structure contains depthwise separable convolutions and residual connections. We performed two training steps to reinforce the local features of the fracture. The first step involved training on the original training set. After the training process converged, the model was run in inference mode on the positive images in the training set. The location of the maximum prediction for the fracture site was taken as a center point. Then, we cropped a 700 × 700 pixel square image around the target site. The cropped images were added to the training set to complete the second training step.
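The cropping step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `crop_around_peak`, the use of a per-pixel activation map with the same shape as the image, and the clamping of the window at the image border are all hypothetical choices that the text does not specify.

```python
import numpy as np

def crop_around_peak(image, activation, size=700):
    """Crop a size x size window centred on the activation peak.

    Assumption: `activation` is a 2-D map with the same height and width as
    `image` (for example, an upsampled heatmap). The window is shifted inward
    when the peak lies near a border so the crop stays inside the image.
    """
    h, w = image.shape[:2]
    row, col = np.unravel_index(np.argmax(activation), activation.shape)
    half = size // 2
    # Clamp the top-left corner so the window does not leave the image.
    top = min(max(row - half, 0), max(h - size, 0))
    left = min(max(col - half, 0), max(w - size, 0))
    return image[top:top + size, left:left + size]

# Hypothetical example: an 880 x 880 image with a peak near the right edge.
image = np.arange(880 * 880).reshape(880, 880)
activation = np.zeros((880, 880))
activation[100, 800] = 1.0
crop = crop_around_peak(image, activation)
```

In this sketch the peak at row 100, column 800 is too close to the border for a centred window, so the window is shifted and the crop still has the full 700 × 700 size.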
Ensemble bidirectional views algorithm
We trained the algorithm with the AP and Lat views of the AXR data independently and validated each view model with its individual results and AUROC values. Then, we combined the fracture probability results of both views into the final AXR result and validated it. The first method, the “sum-up” method, simply sums the prediction values of the two models and evaluates the performance using the new value. The second method, the “severance-OR” method, chooses a cut-off value for each view and defines the patient as having a fracture if a fracture is detected by either model. The “severance-Both” method defines a fracture only if both models detect a fracture.
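The three ensemble rules can be written down in a few lines. The sketch below is illustrative only: the function names, the example probabilities, the cut-off values, and the use of `>=` for the threshold comparison are assumptions, not details taken from the study.

```python
def sum_up(p_ap, p_lat):
    """Combined score: sum of the two view-level fracture probabilities."""
    return p_ap + p_lat

def severance_or(p_ap, p_lat, cut_ap, cut_lat):
    """Positive if either view exceeds its own cut-off (favours sensitivity)."""
    return p_ap >= cut_ap or p_lat >= cut_lat

def severance_both(p_ap, p_lat, cut_ap, cut_lat):
    """Positive only if both views exceed their cut-offs (favours specificity)."""
    return p_ap >= cut_ap and p_lat >= cut_lat

# Hypothetical patient: the Lat model is confident, the AP model is not.
p_ap, p_lat = 0.30, 0.85
cut_ap, cut_lat = 0.50, 0.50  # assumed cut-offs, for illustration only

combined = sum_up(p_ap, p_lat)
flag_or = severance_or(p_ap, p_lat, cut_ap, cut_lat)
flag_both = severance_both(p_ap, p_lat, cut_ap, cut_lat)
```

For this hypothetical patient the OR rule reports a fracture and the Both rule does not, which is the disagreement case that separates the two rules. The summed score still needs its own cut-off before it yields a binary decision.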
Evaluating the algorithm
The trained ankle model was tested on a preserved data set to assess the accuracy of identifying ankle fractures. We randomly extracted 50/50 fracture data from the AP and Lat views to test the algorithm’s performance. We also used a visualization method named “grad-CAM” 24 with the Keras-vis library to generate a heatmap of the regions activated when the model identifies an ankle fracture, as evidence that the model indeed recognized the fracture site. A radiologist also reviewed the heatmaps and compared them with the fracture site in the original image. The probability generated by the ankle fracture model was evaluated with the area under the receiver operating characteristic curve (AUROC). The confusion matrix was also calculated.
Implementation details
The DCNN was built on a workstation running Ubuntu 18.04.3 LTS with an Intel(R) Core(TM) i9-10900X CPU @ 3.70 GHz, 96 GB RAM, and 2 NVIDIA TITAN RTX GPUs, using the TensorFlow and Keras open-source libraries with Python 3.6.9 (Python Software Foundation). Input images were resized to 880 × 880 pixels in 8-bit grayscale to reduce complexity and computation. We used ImageNet as our pre-training material. The pre-trained weights of the DCNN were preserved for AXR training. Image augmentation was randomly applied during the training process with zoom, rotation, width shifting, height shifting, shear transformation, and horizontal flipping operations. The class weight was adjusted according to the class distribution. The Adam optimizer and categorical cross-entropy loss were used to train the model for 60 epochs with a batch size of 4 and a starting learning rate of 1e−5. The model evaluation metric was accuracy. The two ankle models were trained independently.
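The text does not state how the class weights were derived from the class distribution. One common convention, shown here purely as an assumption, is inverse-frequency ("balanced") weighting, where each class weight is the total sample count divided by the number of classes times the class count. The function name and class labels are hypothetical; the counts are the training-set figures from Table 1.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: w_c = N / (K * n_c).

    Assumption: this is one common weighting scheme, not necessarily the
    scheme used in the study.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

# Training-set class counts from Table 1.
train_counts = {"fracture": 488, "no_fracture": 863}
weights = balanced_class_weights(train_counts)
```

Under this scheme the minority class (fracture) receives the larger weight, and the weighted sample count equals the original sample count.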
Statistical analysis
All statistical analyses were performed using R 3.4.3. Continuous variables were calculated with the t-test, and categorical variables were calculated with the χ2 test. We evaluated the ankle model with the AUROC. The cut-off point to evaluate the sensitivity, specificity, false-negative rate, false-positive rate, and F1 score was chosen according to the best Youden’s index. The diagnostic test performance difference between models was evaluated with McNemar’s test.
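Youden's index is J = sensitivity + specificity − 1, and the cut-off is the threshold that maximizes J. The sketch below shows the selection in Python for illustration (the study itself used R); the function name, the toy scores, and the `>=` convention for calling a case positive are assumptions.

```python
def youden_cutoff(scores, labels):
    """Return (cut-off, J) maximising J = sensitivity + specificity - 1.

    A case is called positive when its score is >= the cut-off (assumed
    convention). Every observed score is tried as a candidate threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Toy data (hypothetical): 1 = fracture, 0 = no fracture.
scores = [0.10, 0.20, 0.35, 0.80, 0.70, 0.90]
labels = [0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
```

On ties this sketch keeps the lowest threshold that reaches the maximum J; other tie-breaking rules are equally defensible.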
Results
The demographic data of the AXR data set are shown in Table 1. We obtained a total of 1351 pairs of AXR images to build the model. After completing two steps of the training process, the training accuracy was 0.9987 on the AP view data and 0.9972 on the Lat view data. The complete workflow of this study is shown in Figure 1. The changes in accuracy and loss during the training process are shown in Figure 2.
Figure 1.

The flowchart of this study. AP, anteroposterior.
Figure 2.
(a, b) Changes in accuracy and loss during the training process of the ankle AP model. The accuracy was stable after 45 epochs. (c, d) Changes in accuracy and loss during the training process of the ankle Lat model. The accuracy was stable after 40 epochs.
Each model was tested on the target view of the preserved test set containing data from 100 patients, and the results are presented in Table 2. Out of 50 fractures, 4 (8%) were bimalleolar fractures, 21 (42%) were lateral malleolus fractures, 10 (20%) were medial malleolus fractures, 2 (4%) were trimalleolar fractures, and the remaining 13 (26%) were of other types that were visible in the AXRs. The AP model obtained an AUROC of 0.890 (95% CI: 0.826–0.954) with an accuracy of 0.83, a sensitivity of 0.76, and a specificity of 0.90 at the cut-off point. The Lat model obtained an AUROC of 0.894 (95% CI: 0.831–0.956) with an accuracy of 0.85, a sensitivity of 0.84, and a specificity of 0.86 at the cut-off point. We then ensembled the two models on each patient’s paired ankle images. The “sum-up” method obtained an AUROC of 0.917 (95% CI: 0.863–0.972), which was higher than those of the individual view models. At its cut-off point, the sum-up model obtained accuracy, sensitivity, and specificity values of 0.87, 0.84, and 0.90, respectively. Figure 3 shows the AUROC of the individual and sum-up models on the test set. The “severance-OR” method, in which a fracture is reported if either view is positive, had a higher sensitivity of 0.90 but a lower specificity of 0.78, with an accuracy of 0.84. In contrast, for the “severance-Both” method, in which a fracture is reported only when both views are positive, the sensitivity dropped to 0.70 with a specificity of 0.94 and an accuracy of 0.82. The diagnostic performance of the “sum-up” method did not differ significantly from that of either single-view model. The “sum-up,” “severance-Both,” and “severance-OR” models had significantly different performances from each other.
Table 2.
The performance of each model on the test set
| Individual view | Accuracy | Sensitivity | Specificity | PPV | F1 score |
|---|---|---|---|---|---|
| AP model | 0.83 | 0.76 | 0.90 | 0.88 | 0.816 |
| Lat model | 0.85 | 0.84 | 0.86 | 0.86 | 0.850 |
| Ensemble methods | Accuracy | Sensitivity | Specificity | PPV | F1 score |
|---|---|---|---|---|---|
| Sum-up | 0.87 | 0.84 | 0.90 | 0.89 | 0.864 |
| Severance-OR | 0.84 | 0.90 | 0.78 | 0.80 | 0.847 |
| Severance-Both | 0.82 | 0.70 | 0.94 | 0.92 | 0.795 |
AP, anteroposterior; Lat, Lateral; PPV, positive-predictive value.
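The metrics in Table 2 all follow from the four cells of a confusion matrix. The sketch below shows the standard formulas; the function name is hypothetical, and the example counts for the AP model are back-calculated here from the reported sensitivity and specificity on the 50/50 test set rather than taken from the source.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# AP model, counts inferred from sensitivity 0.76 and specificity 0.90
# with 50 fracture and 50 non-fracture cases (an inference, not source data).
ap = diagnostic_metrics(tp=38, fp=5, tn=45, fn=12)
```

With these inferred counts, the sensitivity, specificity, PPV, and accuracy round to the tabulated AP values. The F1 score computed directly from counts can differ slightly in the third decimal place from the tabulated value, which may reflect rounding of intermediate quantities.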
Figure 3.
(a) The AUROC with CI (blue area) of the AP model on the AP view images. (b) The AUROC with CI (green area) of the Lat model on the lateral view images. (c) The AUROC with CI (pink area) of the sum-up model combines results of two views. AP, anteroposterior; AUROC, area under the receiver operating characteristic curve; CI, confidence interval; Lat, Lateral.
We performed the visualization using grad-CAM on the test set. The heatmaps highlight the fracture areas detected by the models of each view. However, not all fractures are precisely identified by the heatmaps. The visualization examples are shown in Figure 4 and Figure 5.
Figure 4.
(A1) The heatmap generated by grad-CAM highlighted the tiny fracture line of the lateral malleolar fracture in the green box on the AP view. (A2) The same fracture line is more obvious on the lateral view and was also identified by the Lat model. (B1) The AP model identified the fracture fragment on the distal end of the fibula. (B2) The Lat model did not identify any fracture. The fracture area is hidden by other bony structures, so physicians cannot identify it on this view either. AP, anteroposterior.
Figure 5.
(C1, C2) The Lat model identified the fracture line, but the AP model did not. After the results were ensembled by the “sum-up” method, the prediction was still above the threshold. (D1, D2) The model misclassified a foreign body as a fracture site, resulting in a false-positive prediction. AP, anteroposterior.
Discussion
This result shows that a DCNN can be trained to identify ankle fractures using an image data set with high sensitivity (84%), excellent specificity (90%), and good accuracy (87%) (F1 score: 0.864). Ankle fractures are still encountered daily. Uneven or non-displaced ankle fractures are challenging to identify rapidly because of the limitations of the doctor’s experience and the image quality of AXRs. With the assistance of the DCNN, ankle fractures can be detected immediately to assist primary physicians with decisions.
Ankle fracture detection is a promising target for deep learning approaches because of the availability of near-perfect ground-truth labels. Because of the weight-bearing nature of the region, patients who have clinically silent fractures rapidly progress to severe pain and immobility. Postponed ankle fracture management results in poor prognosis and even elevated morbidity later on. 25 Therefore, detecting an ankle fracture as soon as possible is critical to improving the prognosis and decreasing medical costs. Artificial intelligence and the automated detection of bone fractures have been well discussed, especially with the paradigm shift to deep learning methods. 17,18,26,27 Various classification algorithms such as Inception V3, DenseNet 121, ResNet 152, VGG 16, ResNet 50, Xception, and other specially designed DCNNs have been applied to fracture detection in different body parts. There are no head-to-head comparison studies, and the reported accuracy ranges from 83 to 96.9%. 27 Although DCNNs have been used to predict surgical outcomes, there is limited evidence to support the use of AI in detecting ankle fractures. 28,29 A previous study showed the ability of AI to detect ankle fractures with DL; however, high computing power requirements and imperfect diagnostic accuracy are considerations for deployment in the clinical flow.
Furthermore, the large numbers of medical images required for training make it challenging for teams to duplicate the results of other studies. Our study simplified the concept, used common pretraining weights from ImageNet, and trained our model on the different projections of the radiographs separately. In addition, we ensembled the bidirectional algorithms to enhance the diagnostic accuracy and used the knowledge of experts. With this method, we did not need much training data or heavy labeling work, which makes it easier for others to duplicate our results with their data.
Kitamura et al used a similar approach to develop an algorithm. They included plain films of three directions (including the oblique view) to detect ankle fractures with excellent performance (accuracy of 81%). Our study presented a satisfactory result after the ensemble algorithm was applied. Preventing overfitting and keeping the model package simple are essential considerations when developing a medical image diagnostic aid protocol. Although there are some robust segmentation networks, 30,31 labeling lesions still requires enormous work. We applied several image augmentation methods during the development phase to enrich the variability of the training set. We used simplified plain images, which meant that we did not require extensive processing, lesion segmentation, or extraction of domain-specific visual features, all of which place demands on the computing power of server systems. Hospitals and outpatient clinics will not need to set up high-performance computers to apply the deep learning-based algorithm. Our research shows that despite the challenges specific to radiological image data, the development of adequate clean data sets is sufficient to achieve high-level automated performance with deep learning systems. We used a common benchmark classification algorithm that can be easily applied to automated machine learning software and a simple ensemble technique to integrate information from different views and fit it for clinical application on a relatively small data set. Following this concept, researchers can easily reproduce the approach on other body parts that require multiple views for diagnosis. Based on previous experience, 20 additional time savings for segmentation and labeling were achieved by applying the same training pipeline. We used adequate DCNNs trained in a semi-supervised manner on image-level labeled data sets to make processing clinical image data practical.
Many simple tasks, such as anatomy localization, can be achieved with very high accuracy with only a modest commitment by experts. The iterative training process augments the data and automatically enforces the neural network learning of the positive areas without the details of an expert annotation, which vastly decreases the labeling time.
Another critical consideration for extremity films is that in clinics, multiple projections of a radiograph are necessary to detect a possible fracture because radiographs are two-dimensional representations of three-dimensional structures. The images rely on proper patient and X-ray beam positioning to prevent the superimposition of bones and other radiopaque structures. In a clinical situation, doctors use different image views to make diagnoses. In this study, we also combined the algorithm findings to find the best method to diagnose ankle fractures. With the ensemble method, we found that the sensitivity and specificity can reach an acceptable level by using the sum-up method. Compared with the AP or Lat model alone, the ensemble method offered an intrinsic perspective to assist a clinical doctor in diagnosing this common disorder.
Moreover, the different ensemble methods also provide a future direction for anatomical areas that require multiple-view evaluations. Unlike the majority voting method provided in the previous study, we provide more straightforward, clinical, reasonable, and adjustable methods that combine the information from multiple views. The “sum-up” method provides a balanced performance, the “severance-OR” method provides a high sensitivity and may be used as a screening tool, and the “severance-Both” method generates a high specificity prediction and may serve as an exclusion test.
One challenge in AI for medical images is the “black box” mechanism. A model may use another part of an image rather than the actual lesion site to produce an answer. Most deep learning works in the medical field use a cropped image to prevent this problem. 21,32 However, these cropped images are challenging to apply in clinical situations. Therefore, the visualization of features became a solution to understand the underlying mechanism of a DCNN. 33 In this study, we used the whole bidirectional AXR images to train the algorithm. After visualization by the grad-CAM method, the activation area coinciding with the fracture site provided evidence that the model indeed recognized ankle fractures. Our previous study used a web-based system of hip fracture detection in the clinical setting with real-time inference, 23 resulting in a satisfactory improvement in the accuracy of physicians. AXR fracture detection can be deployed with a similar method and can show results to emergency physicians in a few seconds. Understanding what a deep neural network uses to make predictions is an active research topic in machine learning. In the future, based on the observations from this study, the development of similar high-performance algorithms for medical imaging using deep learning has two prerequisites. First, there must be a sizable developmental set with thousands of similar radiographs. Second, a large collection of non-labeled radiographs with similar characteristics (in this study, the radiographs of the bone structures are similar) can be used to pretrain the algorithm, and then a limited number of labeled images can be used for DCNN learning. After training, a multifunctional algorithm can be developed for clinical usage, making it easy to apply DCNNs in the medical image field.
This study has several limitations. One fundamental limitation arises from the nature of DCNNs, where the neural network is provided with only an image and the associated diagnosis without explicit definitions of features. Because the DCNN “learns” the most predictive features, the algorithm might use previously unknown features or features ignored by humans. For instance, although this study showed a good visualization of the identified fractures, the exact features are still unknown. The heatmap produced by grad-CAM can only show the location of the abnormality grossly and is not wholly consistent with specialists. In addition, the algorithm is specifically trained to discriminate between healthy and fractured bone in the background of radiographs. However, the algorithm might be unable to identify other pathologic presentations. Additional oblique views have also been used to check different perspectives of bony structures. 34 However, this view is not routinely used in Chang Gung Memorial Hospital or other studies, 35 which limits the number of data in this category.
Furthermore, this study does not include the detection of other abnormalities in AXRs, which is relevant in routine diagnosis. Finally, although the results of this study are promising, integrating this automatic detection algorithm into clinical pathways is another challenge. Therefore, a randomized, prospective study should be conducted to prove the clinical impact of deep learning on ankle fractures.
Conclusions
Ankle fractures can be identified by a DCNN algorithm trained on an image-level labeled multiview AXR data set. The trade-off between sensitivity and specificity can be adjusted using different ensemble methods depending on the clinical purpose. Furthermore, our algorithm can localize a fracture site with high accuracy and can assist clinical physicians in identifying occult ankle fractures and managing these patients early to prevent further medical costs and impacts on their quality of life.
Footnotes
Acknowledgments: We thank Dr Shih-Ching Kang, F-P Kuo, C-J Chen, and H-F Tien for establishing the CGMH trauma registry databank. This study was supported by the Chang Gung Memorial Hospital (grant numbers CMRPG1K0091, CMRPG3L0211, and CMRPG3J1121) and the Ministry of Science and Technology (MOST-110–2314-B-182A-0901- (NMRPG3L0321)). The funding organizations played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have no conflicts of interest to declare.
Funding: The supporting funding of this project is from the Chang Gung Memorial Hospital Research project CMRPG1K0091, CMRPG3L0211, CMRPG3J1121 and Ministry of science and technology MOST-110–2314-B-182A-0901- (NMRPG3L0321).
Ethics approval and consent to participate: The study was approved by the IRB Committee of the Chang Gung Memorial Hospital. The IRB number is 202002343B0. Informed consent was waived.
The authors Yu-Tung Wu and Chien-Hung Liao contributed equally to the work.
Author contributions: Conception and design: C-T C. and C-H L. Funding obtainment: C-T C. Provision of study data: C-P H., C-H. O. and Y-K K. Collection and assembly of data: N-Y. L., J-Y L., H-S L. and H-S L. Data analysis and interpretation: C-T C., J-Y L. and H-S L. Manuscript writing: C-T C., C-P H. and H-W C. Final review of the manuscript: Y-T W. and C-H L. Final approval of the manuscript: C-H L.
Availability of data and material: Restrictions apply to the availability of the development and external test data sets, which were used with the permission of the participants for the present study. Anonymized data may be available for research purposes from the corresponding authors on request. The data are not publicly available due to institutional restrictions and patient privacy concerns. The code used in this study can be accessed at GitHub (https://github.com/houhsein/Ankle-Fractures-detection). Any additional information required to reanalyze the data reported in this paper is available upon request.
Contributor Information
Chi-Tung Cheng, Email: atong89130@gmail.com.
Chih-Po Hsu, Email: chihpo1227@gmail.com.
Chun-Hsiang Ooyang, Email: detv090@gmail.com.
Chia-Yi Chou, Email: nancy9350521@gmail.com.
Nai-Yu Lin, Email: mp2225@cgmh.org.tw.
Jia-Yen Lin, Email: meditator184@gmail.com.
Yi-Kang Ku, Email: miyita2003@gmail.com.
Hou-Shian Lin, Email: jacky831006@gmail.com.
Shao-Ku Kao, Email: kaosk@mail.cgu.edu.tw.
Huan-Wu Chen, Email: b2401003@gmail.com.
Yu-Tung Wu, Email: overwinterwu@gmail.com.
Chien-Hung Liao, Email: surgymet@gmail.com.
REFERENCES
- 1. De Boer AS, Schepers T, Panneman MJM, Van Beeck EF, Van Lieshout EMM. Health care consumption and costs due to foot and ankle injuries in the Netherlands, 1986-2010. BMC Musculoskelet Disord 2014; 15: 128. doi: 10.1186/1471-2474-15-128
- 2. Davidovitch RI, Walsh M, Spitzer A, Egol KA. Functional outcome after operatively treated ankle fractures in the elderly. Foot Ankle Int 2009; 30: 728–33. doi: 10.3113/FAI.2009.0728
- 3. Beckenkamp PR, Lin C-W, Chagpar S, Herbert RD, van der Ploeg HP, Moseley AM. Prognosis of physical function following ankle fracture: a systematic review with meta-analysis. J Orthop Sports Phys Ther 2014; 44: 841–51. doi: 10.2519/jospt.2014.5199
- 4. Daly PJ, Fitzgerald RH, Melton LJ, Ilstrup DM. Epidemiology of ankle fractures in Rochester, Minnesota. Acta Orthop Scand 1987; 58: 539–44. doi: 10.3109/17453678709146395
- 5. Juto H, Nilsson H, Morberg P. Epidemiology of adult ankle fractures: 1756 cases identified in Norrbotten County during 2009-2013 and classified according to AO/OTA. BMC Musculoskelet Disord 2018; 19(1): 441. doi: 10.1186/s12891-018-2326-x
- 6. Court-Brown CM, McBirnie J, Wilson G. Adult ankle fractures -- an increasing problem? Acta Orthop Scand 1998; 69: 43–47. doi: 10.3109/17453679809002355
- 7. Willett K, Keene DJ, Mistry D, Nam J, Tutton E, Handley R, et al. Close contact casting vs surgery for initial treatment of unstable ankle fractures in older adults: a randomized clinical trial. JAMA 2016; 316: 1455–63. doi: 10.1001/jama.2016.14719
- 8. Jensen SL, Andresen BK, Mencke S, Nielsen PT. Epidemiology of ankle fractures. A prospective population-based study of 212 cases in Aalborg, Denmark. Acta Orthop Scand 1998; 69: 48–50. doi: 10.3109/17453679809002356
- 9. Bielska IA, Wang X, Lee R, Johnson AP. The health economics of ankle and foot sprains and fractures: a systematic review of English-language published papers. Part 2: the direct and indirect costs of injury. Foot (Edinb) 2019; 39: 115–21. doi: 10.1016/j.foot.2017.07.003
- 10. Mencio GA, Swiontkowski MF. Green’s Skeletal Trauma in Children E-Book. Elsevier Health Sciences; 2014, p. 688.
- 11. Wu Y, Jiang H, Wang B, Miao W. Fracture of the lateral process of the talus in children: a kind of ankle injury with frequently missed diagnosis. J Pediatr Orthop 2016; 36: 289–93. doi: 10.1097/BPO.0000000000000437
- 12. York TJ, Jenkins PJ, Ireland AJ. Reporting discrepancy resolved by findings and time in 2947 emergency department ankle x-rays. Skeletal Radiol 2020; 49: 601–11. doi: 10.1007/s00256-019-03317-7
- 13. Romero Lauro G, Cable W, Lesniak A, Tseytlin E, McHugh J, Parwani A, et al. Digital pathology consultations-a new era in digital imaging, challenges and practical applications. J Digit Imaging 2013; 26: 668–77.
- 14. Yang S, Yin B, Cao W, Feng C, Fan G, He S. Diagnostic accuracy of deep learning in orthopaedic fractures: a systematic review and meta-analysis. Clin Radiol 2020; 75: 713. doi: 10.1016/j.crad.2020.05.021
- 15. Kermany DS, Goldbaum M, Cai W, Valentim CCS, Liang H, Baxter SL, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018; 172: 1122–31. doi: 10.1016/j.cell.2018.02.010
- 16. Prevedello LM, Erdal BS, Ryu JL, Little KJ, Demirer M, Qian S, et al. Automated critical test findings identification and online notification system using artificial intelligence in imaging. Radiology 2017; 285: 923–31. doi: 10.1148/radiol.2017162664
- 17. Cheng C-T, Wang Y, Chen H-W, Hsiao P-M, Yeh C-N, Hsieh C-H, et al. A scalable physician-level deep learning algorithm detects universal trauma on pelvic radiographs. Nat Commun 2021; 12(1): 1066. doi: 10.1038/s41467-021-21311-3
- 18. Chung SW, Han SS, Lee JW, Oh K-S, Kim NR, Yoon JP, et al. Automated detection and classification of the proximal humerus fracture by using deep learning algorithm. Acta Orthop 2018; 89: 468–73. doi: 10.1080/17453674.2018.1453714
- 19. Roy S, Meena T, Lim SJ. Demystifying supervised learning in healthcare 4.0: A new reality of transforming diagnostic medicine. Diagnostics (Basel) 2022; 12(10): 2549. doi: 10.3390/diagnostics12102549
- 20. Cheng C-T, Ho T-Y, Lee T-Y, Chang C-C, Chou C-C, Chen C-C, et al. Application of a deep learning algorithm for detection and visualization of hip fractures on plain pelvic radiographs. Eur Radiol 2019; 29: 5469–77. doi: 10.1007/s00330-019-06167-y
- 21. Gale W, Oakden-Rayner L, Carneiro G, Bradley AP, Palmer LJ. Detecting hip fractures with radiologist-level performance using deep neural networks. Internet. Available from: http://arxiv.org/abs/1711.06504
- 22. Lindsey R, Daluiski A, Chopra S, Lachapelle A, Mozer M, Sicular S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A 2018; 115: 11591–96. doi: 10.1073/pnas.1806905115
- 23. Cheng C-T, Chen C-C, Cheng F-J, Chen H-W, Su Y-S, Yeh C-N, et al. A human-algorithm integration system for hip fracture detection on plain radiography: system development and validation study. JMIR Med Inform 2020; 8(11): e19416. doi: 10.2196/19416
- 24. Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI; 2017. pp. 1251–58. doi: 10.1109/CVPR.2017.195
- 25. Schepers T, De Vries MR, Van Lieshout EMM, Van der Elst M. The timing of ankle fracture surgery and the effect on infectious complications; a case series and systematic review of the literature. Int Orthop 2013; 37: 489–94. doi: 10.1007/s00264-012-1753-9
- 26. Kim DH, MacKinnon T. Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks. Clin Radiol 2018; 73: 439–45. doi: 10.1016/j.crad.2017.11.015
- 27. Meena T, Roy S. Bone fracture detection using deep supervised learning from radiological images: A paradigm shift. Diagnostics (Basel) 2022; 12(10): 2420. doi: 10.3390/diagnostics12102420
- 28. Kitamura G, Chung CY, Moore BE. Ankle fracture detection utilizing a convolutional neural network ensemble implemented with a small sample, de novo training, and multiview incorporation. J Digit Imaging 2019; 32: 672–77. doi: 10.1007/s10278-018-0167-7
- 29. Olczak J, Emilson F, Razavian A, Antonsson T, Stark A, Gordon M. Ankle fracture classification using deep learning: automating detailed AO Foundation/Orthopedic Trauma Association (AO/OTA) 2018 malleolar fracture identification reaches a high degree of correct classification. Acta Orthop 2021; 92: 102–8. doi: 10.1080/17453674.2020.1837420
- 30. Pal D, Reddy PB, Roy S. Attention UW-net: a fully connected model for automatic segmentation and annotation of chest X-ray. Comput Biol Med 2022; 150: 106083. doi: 10.1016/j.compbiomed.2022.106083
- 31. Gangopadhyay T, Halder S, Dasgupta P, Chatterjee K, Ganguly D, Sarkar S, et al. MTSE U-net: an architecture for segmentation, and prediction of fetal brain and gestational age from MRI of brain. Netw Model Anal Health Inform Bioinforma 2022; 11(1). doi: 10.1007/s13721-022-00394-y
- 32. Liu F-Y, Chen C-C, Cheng C-T, Wu C-T, Hsu C-P, Fu C-Y, et al. Automatic hip detection in anteroposterior pelvic radiographs-A labelless practical framework. J Pers Med 2021; 11(6): 522. doi: 10.3390/jpm11060522
- 33. Erhan D, Bengio Y, Courville A, Vincent P. Visualizing higher-layer features of a deep network. University of Montreal 2009; 3: 1341.
- 34. De Smet AA, Doherty MP, Norris MA, Hollister MC, Smith DL. Are oblique views needed for trauma radiography of the distal extremities? AJR Am J Roentgenol 1999; 172: 1561–65. doi: 10.2214/ajr.172.6.10350289
- 35. Brandser EA, Berbaum KS, Dorfman DD, Braksiek RJ, El-Khoury GY, Saltzman CL, et al. Contribution of individual projections alone and in combination for radiographic detection of ankle fractures. AJR Am J Roentgenol 2000; 174: 1691–97. doi: 10.2214/ajr.174.6.1741691