Abstract
Digital radiography is one of the most common and cost-effective standards for the diagnosis of bone fractures. Such diagnoses require expert intervention, which is time-consuming and demands rigorous training. With the recent growth of computer vision algorithms, there is a surge of interest in computer-aided diagnosis. The development of such algorithms demands large datasets with proper annotations. Existing X-ray datasets are either small or lack proper annotation, which hinders the development of machine-learning algorithms and the evaluation of their relative performance for classification, localization, and segmentation. We present FracAtlas, a new dataset of X-ray scans curated from images collected from 3 major hospitals in Bangladesh. Our dataset includes 4,083 images that have been manually annotated for bone fracture classification, localization, and segmentation with the help of 2 expert radiologists and an orthopedist using the open-source labeling platform makesense.ai. There are 717 images with 922 instances of fractures. Each fracture instance has its own mask and bounding box, and each scan also has global labels for classification tasks. We believe the dataset will be a valuable resource for researchers interested in developing and evaluating machine learning algorithms for bone fracture diagnosis.
Subject terms: Medical research, Diagnosis
Background & Summary
There has been a surge in demand for computer-aided diagnosis (CAD) systems in recent decades. Moreover, different fields of medical science have recently seen rapid development of automated diagnosis that leverages large datasets and advanced machine learning algorithms1,2. Models like convolutional neural networks (CNN)3, You Only Look Once (YOLO)4 and U-Net5 can achieve expert-like performance in detecting anomalies from X-ray scans. Training such models requires large and well-annotated datasets6–8. It is difficult to collect such data from hospitals and diagnostic centers. The annotation process can be very costly, as it requires the involvement of multiple physicians and radiologists who must reach consensus to remove bias and human error. Due to the sensitive nature of medical data, it is also very hard to make the acquired data available for public use. In short, the creation of such datasets is costly and time-consuming2,9.
Some of the well-known X-ray datasets that are publicly available and that focus on anomalies include MURA10, MedPix11, GRAZPEDWRI-DX12, IIEST1, MOST13, VinDr-CXR2, VinDr-SpineXR14 and ChestX-ray1415. Among these datasets, MURA is a collection of 2D musculoskeletal radiographs with 40,561 images from different regions such as the elbow, finger, forearm, hand, humerus, shoulder, and wrist10. Each image is labeled as ‘Normal’ or ‘Abnormal’, which makes it suitable for classification tasks; however, it lacks proper annotation for localization and segmentation. MedPix is an online database of 2D and 3D medical scans of various diseases that can be filtered by the keyword ‘fracture’, resulting in 954 images. These images include X-rays, real images, magnetic resonance imaging (MRI), computed tomography (CT) scans, and ultrasound imaging. However, the dataset has issues such as unorganized annotation and falsely labeled images, as well as some spam images. GRAZPEDWRI-DX is a recently released dataset containing 20,327 scans, collected from 6,091 patients, with annotation for localization. Though this is a suitably large dataset, it covers only wrist fractures, omitting the rest of the human body. IIEST is a small dataset of 2D X-rays containing 217 images, of which 49 are healthy, 99 are fractured and 69 are cancerous bone scans. This dataset is too small for machine-learning use. MOST is a dataset that contains 4,446 X-ray and MRI scans labeled by the Kellgren–Lawrence (KL) grading system16, which has five classes from grade 0 to grade 4 with increasing severity from one to the next. This dataset is no longer available in the public domain due to a lack of funding and project closeout. It also covers only knee joint fractures. VinDr-CXR is also a recently published dataset, which contains 18,000 images of chest X-rays (CXR) with manual annotation for localization. This dataset contains samples for 28 different types of chest diseases and abnormalities. Though this is a good dataset for identifying chest diseases, it is not suitable for bone fracture identification. Likewise, the VinDr group has other datasets, VinDr-Mammo18, VinDr-SpineXR14 and PediCXR17, which are not suitable for fracture study for similar reasons. ChestX-ray14 is a radiograph dataset containing 112,120 CXR scans. This dataset is also not suitable for bone fracture study as it only contains samples of chest diseases. Some prominent datasets of human body radiographs are compared with the FracAtlas dataset19 in Table 1.
Table 1.
Dataset | Release year | Samples | Global labels | Local labels | Local mask | Multi-locale |
---|---|---|---|---|---|---|
MURA10 | 2017 | 40,561 | Available | N/A* | N/A | Yes |
MedPix11 †† | 2016 | 1,954 | Available | N/A | N/A | Yes |
GRAZPEDWRI-DX12 † | 2022 | 20,327 | Available | Available | N/A | No |
IIEST1 (Δ) | 2020 | 217 | Available | N/A | N/A | Yes |
MOST13 | 2020 | 4,446 | Available | N/A | N/A | Yes |
VinDr-CXR2 (•) | 2022 | 18,000 | Available | Available | N/A | No |
PediCXR17 (•) | 2023 | 9,125 | Available | Available | N/A | No |
ChestX-ray1415 (•) | 2017 | 112,120 | Available | N/A | N/A | No |
RSNA Pediatric26 (•) | 2017 | 14,236 | Available | N/A | N/A | No |
FracAtlas | 2023 | 4,083 | Available | Available | Available | Yes |
*Can be generated or interpreted from the given data. (†) Used a mixture of automated NLP tools and manual labeling. (††) Annotation is fully automated without any manual validation. (Δ) Too small to be applicable for training deep learning models. (•) Not suitable for fracture study.
The shortcomings of the existing datasets are that most of them can only be used for classification tasks or they lack proper annotation. Some are also mislabeled or poorly maintained, which makes them unsuitable for machine learning tasks. The available high-quality X-ray datasets are not intended for bone fracture study. With the recent advancements in CAD systems, datasets for classification alone are no longer enough, as most work is moving toward localization and segmentation models20. Such tasks require well-maintained and documented datasets with proper manual annotation. Due to the sensitive nature of the medical domain, models must perform at a high level, and achieving that requires a large dataset with high-quality annotation7.
Most of the prominent works on bone fracture classification, localization and segmentation have used private datasets21–23. Because publicly accessible datasets are currently unavailable, it is not feasible to conduct a comparative analysis of state-of-the-art (SOTA) methods. To solve this problem we introduce the FracAtlas dataset19, which has been created by collecting 14,068 X-ray scans from three prominent hospitals in Bangladesh. From these 14,068 scans, 4,083 images have been isolated from regions like the hand, shoulder, leg and hip. The rest of the scans were discarded as they were from the chest or skull region. Due to security and privacy concerns, we have anonymized all the patient-related structured data such as name, age, gender, time of diagnosis, etc. from each of the scans. The collected DICOM images have been converted to JPG format. The dataset can be accessed at figshare (10.6084/m9.figshare.22363012).
Methods
We have created the FracAtlas dataset19 in four main steps: (1) data collection, (2) data cleaning, (3) finding the general distribution of the cleaned data and (4) annotation of the dataset. Throughout the years 2021 and 2022, 14,068 X-ray scans were collected from 3 hospitals and diagnostic centers. Most of the scans were collected from Lab-Aid Medical Center, Brahmanbaria, along with Anupam General Hospital and Diagnostic Center, Bogra and Prime Diagnostic Center, Barishal. The acquired DICOM images were generated by Fujifilm and Philips devices. The complete process is illustrated in Fig. 1. The ethical clearance of this study was approved by the Institutional Research Ethics Board (IREB) according to the Bangladesh Medical Research Council (BMRC). The IREB approved the open publication of the data on the basis that there are adequate provisions to maintain the confidentiality of the individuals through proper filtration of personally identifiable information. Furthermore, permission to publish the data in the public domain was also obtained at the source. Consent for data collection for all subjects (adults, and parents in the case of minors) was taken as part of the initiation of the diagnosis at the medical facilities. The data collection process had no effect on the clinical treatment or diagnostic processes of the three hospitals involved, and all personally identifiable information in the gathered data has been removed. The whole process was administered according to the Institutional Research Ethics Board of United International University.
Data collection and cleanup
In the initial phase, a total of 14,068 X-rays were collected. As the hospitals and diagnostic centers could not share patient information due to privacy concerns, all the DICOM images were given an arbitrary image name and converted to JPG image format. This removed all the sensitive information that was present in the metadata of the DICOM images. These conversions were done using the proprietary software of the corresponding X-ray machines. The renaming process was automated using a Python script. The renamed DICOM images were stored separately in the hospital database for later study of the general distribution. All the X-ray scans that have been collected are for general-purpose diagnosis. This means that, along with bone fracture scans, there are also samples for chest diseases and abnormalities in the skull and spine region. In the collected data the number of bone fracture samples in the chest, skull and spine region was sparse. As a result, scans for these parts were removed under the supervision of a medical officer. This left us with 4,083 scans from the hand, leg, hip and shoulder regions. Figure 2 shows some valid vs outlier images for the dataset. Some of the images in our dataset contain logos and texts which have not been removed.
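The renaming step can be illustrated with a short sketch. This is not the authors' script: the function name `anonymized_names`, the shuffling and the starting index are assumptions made for illustration. Only the output pattern (`IMG` followed by a zero-padded seven-digit number and `.jpg`) follows the naming convention described under Data Records.

```python
import random
from pathlib import Path


def anonymized_names(paths, seed=0, start=1):
    """Map each original path to an arbitrary name such as IMG0000001.jpg.

    Shuffling is an assumption of this sketch: it decouples the new index
    from the original ordering so the name carries no patient information.
    """
    shuffled = sorted(paths)               # deterministic starting order
    random.Random(seed).shuffle(shuffled)  # reproducible arbitrary order
    return {
        original: f"IMG{index:07d}{Path(original).suffix.lower()}"
        for index, original in enumerate(shuffled, start=start)
    }


# Hypothetical input file names, used only to show the mapping.
mapping = anonymized_names(["scans/b.JPG", "scans/a.jpg"])
for original, new_name in mapping.items():
    print(original, "->", new_name)
```

Returning a mapping instead of renaming in place keeps the sketch free of side effects; the mapping can then be applied with `Path.rename` once it has been checked for name collisions.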
Distribution analysis
After the cleanup process, a demographic analysis was done on the 4,083 images. In our study, we have observed that the age of the patients has a major impact on fracture analysis. For younger subjects (0–7 years old), the ends of bones near the joints can look like separate disjoint disc pads due to lack of bone density. A model trained only on patients above this range can misjudge those bone structures as fractures. Conversely, for older patients (above 50 years old) the surface of bones can look rough24, which can also lead a model to misjudge those bones as fractured. It is therefore crucial that a dataset intended for fracture study covers a diverse range of patient ages. As all metadata of the X-ray images were discarded at the time of collection, the images remaining after dataset cleanup were taken back to the corresponding hospitals to find out the distribution of age and gender across the entire dataset. The age of subjects in our dataset ranges from 8 months to 78 years. The gender distribution for abnormal studies is 85.4% male and 14.6% female. The gender ratio for the whole dataset (normal + abnormal cases) is approximately 62% male and 38% female. There are 717 abnormal scans in our dataset, which contain a total of 922 instances of fractures. Each abnormal study contains at least 1 and at most 5 fracture instances. Some of the scans have multiple views and locales in them. The whole dataset contains 4,083 images and 4,497 locales. There are 396 images with different views of the same organ in the same image. There are 99 images with orthopedic fixation devices (hardware) in them. The FracAtlas dataset19 has a total of 1,538 scans of the hand, of which 437 are fractured. There are 2,272 leg scans, 338 hip scans and 349 shoulder scans, of which 263, 63 and 63, respectively, belong to the fractured class. Figure 3 illustrates the distribution of different properties present in the dataset. The FracAtlas dataset comprises a total of 2,503 frontal, 1,492 lateral, and 418 oblique view images, each pertaining to different organs. Of these, the ‘Fractured’ class includes 438 frontal, 325 lateral, and 45 oblique view images, whereas the ‘Non-fractured’ class encompasses 2,065 frontal, 1,167 lateral, and 373 oblique views. The relative distribution is illustrated in Fig. 4.
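The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the text; the variable names are illustrative, and the derived ratios are simple quotients of the reported figures rather than additional results.

```python
# Counts copied from the text above.
region_scans = {"hand": 1538, "leg": 2272, "hip": 338, "shoulder": 349}
views_fractured = {"frontal": 438, "lateral": 325, "oblique": 45}
views_non_fractured = {"frontal": 2065, "lateral": 1167, "oblique": 373}
views_total = {"frontal": 2503, "lateral": 1492, "oblique": 418}

total_images, fractured_images, fracture_instances = 4083, 717, 922

# Region counts add up to the number of locales, not images,
# because a single scan can show more than one region.
total_locales = sum(region_scans.values())

# Each per-view total should equal fractured + non-fractured counts.
views_consistent = all(
    views_fractured[view] + views_non_fractured[view] == views_total[view]
    for view in views_total
)

fractured_share = fractured_images / total_images
instances_per_fractured_image = fracture_instances / fractured_images

print(f"locales: {total_locales}")
print(f"view totals consistent: {views_consistent}")
print(f"fractured share of images: {fractured_share:.1%}")
print(f"fracture instances per fractured image: {instances_per_fractured_image:.2f}")
```

The check makes the class imbalance explicit: fewer than one in five images contains a fracture, which is worth keeping in mind when choosing a split or a loss weighting.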
Data labeling
The distribution analysis of the data was followed by a review process by two expert radiologists, each with years of experience in the field. The radiologists went through all 4,083 images and labeled each image by identifying the presence and number of fractures, along with the anatomical location of each fracture. After full observation, the fracture lists generated by the two radiologists were cross-checked against one another. The images that had unanimous labels provided by the radiologists were taken as fractured scans. In case of any disparities in the location of fractures or the count of fracture locales, the images were referred to an expert orthopedic surgeon for further review and validation. The surgeon labeled those images independently, and his findings were then cross-checked against the ones generated by the radiologists. After comparing all 3 sets of labels, the final labels were agreed upon. After resolving all conflicts, the images were manually annotated using makesense.ai (https://github.com/SkalskiP/make-sense). The primary annotation generated for bone fractures was in Common Objects in Context (COCO) format25. This format allows for the creation of polygon masks of the fracture regions. Each image can have multiple locales marked by separate masks, and different masks are allowed to overlap. The COCO JavaScript Object Notation (JSON) format was chosen for manual annotation because it contains the most information and can be converted to other annotation formats, either lossily, as with YOLO and Pascal Visual Object Classes (VOC), or losslessly, as with the Visual Geometry Group (VGG) format. The COCO JSON format is used for segmentation tasks, whereas YOLO and Pascal VOC are used for localization. The original record maintained for the primary labeling process is also provided for classification tasks in Comma Separated Values (CSV) format. Figure 5 shows different annotation types provided with the dataset.
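To illustrate why converting a polygon mask to a bounding box is lossy, the sketch below derives a COCO-style box from a COCO-style polygon. It assumes the usual COCO conventions (a polygon stored as a flat list `[x1, y1, x2, y2, ...]` and a box stored as `[x_min, y_min, width, height]`); the function names and the example polygon are hypothetical and are not taken from the dataset.

```python
def polygon_to_bbox(flat_polygon):
    """Return [x_min, y_min, width, height] enclosing a flat COCO polygon."""
    xs = flat_polygon[0::2]  # even positions hold x coordinates
    ys = flat_polygon[1::2]  # odd positions hold y coordinates
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def polygon_area(flat_polygon):
    """Area of a simple polygon via the shoelace formula."""
    xs = flat_polygon[0::2]
    ys = flat_polygon[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2


# Hypothetical triangular mask: its box covers twice the mask area.
triangle = [10, 20, 50, 20, 10, 60]
box = polygon_to_bbox(triangle)
print("bbox:", box)
print("mask area:", polygon_area(triangle), "box area:", box[2] * box[3])
```

Once only the box is stored, the shape of the mask cannot be recovered, which is why the polygon annotations are the natural primary format.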
Data Records
The FracAtlas dataset19 has been made available for public download through Figshare (10.6084/m9.figshare.22363012). The data can be downloaded without any need for registration. The total size of our dataset is 323 MB. The folder structure containing the dataset with all relevant files is described below.
Folder structure
The root folder for the dataset is named “FracAtlas”. It contains the subfolders “images”, “Annotations” and “Utilities”, and a “dataset.csv” file. The CSV file was generated during the data labeling process discussed in the Methods section. Figure 6 gives an overview of the folder structure. The CSV contains columns representing whether a scan has the “hand”, “leg”, “hip” or “shoulder” region present in it, along with a column indicating whether the scan contains multiple regions. It also has a “hardware” column corresponding to the presence of orthopedic fixation devices in the scan. Some X-ray scans have multiple views of the same organ projected from the frontal (coronal) plane and the sagittal plane. Those images can be identified using the “multiscan” column in the CSV. The “fractured” column represents whether a scan has fractures in it. The “frontal”, “lateral” and “oblique” columns represent the perspectives present in a scan. All of these columns hold binary values, where ‘1’ and ‘0’ indicate that the attribute is present or absent in that particular image, respectively. The only exception is the “fracture_count” column, which has numerical values from 0 to 5 representing the number of fracture instances present in that image.
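As an illustration of how these columns might be used, the sketch below filters a small in-memory excerpt shaped like “dataset.csv”. The `image_id` column name and the three sample rows are assumptions made for the example; only the other column names come from the description above. Pointing `load_rows` at the real file (for example via `open("FracAtlas/dataset.csv", newline="")`) would require checking the actual header first.

```python
import csv
import io

# Hypothetical excerpt: "image_id" and the rows below are assumptions.
SAMPLE_CSV = """image_id,hand,leg,hip,shoulder,hardware,multiscan,fractured,fracture_count,frontal,lateral,oblique
IMG0000001.jpg,1,0,0,0,0,0,1,2,1,0,0
IMG0000002.jpg,0,1,0,0,1,1,0,0,1,1,0
IMG0000003.jpg,0,0,0,1,0,0,1,1,0,0,1
"""


def load_rows(handle):
    """Parse the CSV, converting every column except the image name to int."""
    return [
        {key: (value if key == "image_id" else int(value))
         for key, value in record.items()}
        for record in csv.DictReader(handle)
    ]


def select(rows, **flags):
    """Return image names whose columns match all given values."""
    return [
        row["image_id"]
        for row in rows
        if all(row[column] == value for column, value in flags.items())
    ]


rows = load_rows(io.StringIO(SAMPLE_CSV))
fractured_hands = select(rows, hand=1, fractured=1)
total_instances = sum(row["fracture_count"] for row in rows)
print(fractured_hands, total_instances)
```

Keeping the filter generic over column names means the same helper covers region, view and hardware queries without special cases.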
The “images” folder has two subfolders named “Fractured” and “Non_fractured”. The “Fractured” folder contains all the images that contain fractures, whereas the “Non_fractured” folder contains all the healthy bone radiographs. All the scan names start with “IMG” followed by a zero-padded seven-digit value that is unique to each image, and end with “.jpg”, signifying the file type. The folder named “Annotations” comprises various annotation files for segmentation and localization purposes. For segmentation, there are two subfolders named “COCO JSON” and “VGG JSON”, which contain the corresponding annotation types in the “.json” format. These files contain annotations only for images that have fractures. For localization, there are two subfolders named “YOLO” and “PASCAL VOC” (PASCAL Visual Object Classes) containing “.txt” and “.xml” files, respectively, named after the corresponding image files. Additionally, the “YOLO” folder has a “classes.txt” file that lists the available classes for localization; in this case, there is only one class, named “fractured”.
The “Utilities” folder contains several notebooks used in the preparation of the dataset. As the manual annotations were done in COCO JSON format, the YOLO annotations were generated from the COCO masks using “coco2yolo.ipynb”, and the PASCAL VOC annotations were later generated from the YOLO annotations using “yolo2voc.ipynb”. Under the subfolder “Fracture Split” there are 3 CSV files titled “test.csv”, “train.csv” and “valid.csv”. These files list the images used for testing, training and validation, respectively, in the technical validation of the dataset.
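The two conversions can be sketched as follows. This is an illustrative reimplementation under the standard format conventions, not the content of the notebooks: a COCO box is `[x_min, y_min, width, height]` in pixels, a YOLO label line is `class x_center y_center width height` normalized by the image size, and a PASCAL VOC box is given as pixel corners. The function names and the example image size are assumptions.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height, class_id=0):
    """Convert a COCO pixel box to a normalized YOLO label line."""
    x_min, y_min, width, height = bbox
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{width / image_width:.6f} {height / image_height:.6f}"
    )


def yolo_to_voc(line, image_width, image_height):
    """Convert a YOLO label line to (class_id, (xmin, ymin, xmax, ymax))."""
    class_id, x_center, y_center, width, height = line.split()
    x_center, width = float(x_center) * image_width, float(width) * image_width
    y_center, height = float(y_center) * image_height, float(height) * image_height
    corners = (
        round(x_center - width / 2),
        round(y_center - height / 2),
        round(x_center + width / 2),
        round(y_center + height / 2),
    )
    return int(class_id), corners


# Hypothetical box on a hypothetical 400 x 200 pixel image.
yolo_line = coco_bbox_to_yolo([100, 50, 200, 100], image_width=400, image_height=200)
voc_box = yolo_to_voc(yolo_line, image_width=400, image_height=200)
print(yolo_line)
print(voc_box)
```

Class id 0 corresponds to the single “fractured” class listed in “classes.txt”. Because YOLO coordinates are normalized, the image dimensions must be known for the conversion in either direction.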
Technical Validation
All the images in the dataset were manually examined to make sure no individually identifiable information is attached to or embedded in the dataset images. After the annotation process, all the generated masks were reviewed by the medical officer. To make sure the dataset is suitable for training machine learning algorithms, we trained fracture localization and segmentation models using YOLOv8s and YOLOv8s-seg, respectively.
The fractured images were randomly split into roughly 80% (574) training, 12% (82) validation and 8% (61) test images for training and testing both the localization and segmentation models. The training was done on a Windows laptop equipped with an Nvidia RTX 3070 GPU with 8 GB of video memory and an AMD Ryzen 5900HX processor. Both models were pre-trained on COCO25 and trained for 30 epochs. The input size in both cases was 600 pixels, with standard hyperparameters. Table 2 lists the relative performance across different tasks.
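A random split with the reported subset sizes can be sketched as below. The seed, the function name and the placeholder file names are assumptions, so this will not reproduce the authors' split; the split actually used is the one listed in the CSV files under “Fracture Split”.

```python
import random


def split_images(image_names, n_train=574, n_valid=82, seed=0):
    """Shuffle names reproducibly and cut them into train/valid/test lists."""
    names = sorted(image_names)         # fixed order before shuffling
    random.Random(seed).shuffle(names)  # seed is an arbitrary assumption
    train = names[:n_train]
    valid = names[n_train:n_train + n_valid]
    test = names[n_train + n_valid:]    # the remainder (61 of 717 images)
    return train, valid, test


# Placeholder names standing in for the 717 fractured images.
placeholder_names = [f"IMG{i:07d}.jpg" for i in range(1, 718)]
train, valid, test = split_images(placeholder_names)
print(len(train), len(valid), len(test))
```

Fixing the subset sizes rather than the percentages avoids rounding ambiguity: 574, 82 and 61 images correspond to about 80.1%, 11.4% and 8.5% of the 717 fractured images.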
Table 2.
Task | Model | Type | Precision | Recall | mAP@0.5 |
---|---|---|---|---|---|
Localization | YOLOv8s | Box | 0.807 | 0.473 | 0.562 |
Segmentation | YOLOv8s-seg | Box | 0.718 | 0.607 | 0.627 |
Segmentation | YOLOv8s-seg | Mask | 0.83 | 0.499 | 0.589 |
Object detection performance
For the localization task, fractures were detected with a box precision of 80.7%, a recall of 47.3% and an mAP of 56.2% at an IoU threshold of 0.5 on the validation set.
Segmentation performance
For the segmentation task, fractures were detected with a box precision of 71.8%, a recall of 60.7% and an mAP of 62.7% at an IoU threshold of 0.5 on the validation set. For the masks, the precision is 83%, the recall is 49.9% and the mAP at an IoU threshold of 0.5 is 58.9%.
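The IoU threshold behind mAP@0.5 can be made concrete with a small sketch. It assumes boxes given as `(x_min, y_min, x_max, y_max)` pixel corners and uses hypothetical function names; it illustrates the matching criterion only and is not the evaluation code used to produce Table 2.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_width <= 0 or inter_height <= 0:
        return 0.0  # boxes do not overlap
    intersection = inter_width * inter_height
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)


def matches(predicted_box, ground_truth_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return box_iou(predicted_box, ground_truth_box) >= threshold


# Hypothetical boxes: shifting a 10 x 10 box by half its width gives IoU 1/3.
ground_truth = (0, 0, 10, 10)
shifted = (5, 0, 15, 10)
print(box_iou(ground_truth, shifted), matches(ground_truth, shifted))
```

A prediction shifted by half the box width already falls below the 0.5 threshold, which shows how strict the criterion is for small fracture regions.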
Usage Notes
The dataset FracAtlas19 is made freely available for any purpose. The data provided within this work are free to copy, share or redistribute in any medium or format. The data might be adapted, remixed, transformed, and built upon. The dataset is licensed under a Creative Commons “Attribution 4.0 International” license (https://creativecommons.org/licenses/by/4.0/).
Additionally, any publication that utilizes this resource is requested to cite the original paper, and authors are encouraged to share their code and models to help the research community reproduce the experiments and advance the field of medical imaging.
Acknowledgements
We extend our gratitude to Lab-Aid Medical Center, Anupam General Hospital, and Prime Diagnostic Center for granting us permission to access their image databases and for allowing us to release the FracAtlas dataset19 to the public. We express our sincere appreciation to Dr. Asaduzzaman Bhuiyan for overseeing the data collection, de-identification, and dataset clean-up procedures. We are also grateful to Dr. Tonmoy Jamshed Alam for his valuable contributions to reviewing and validating the data labels. We would like to thank Effat Jahan for her valuable assistance in the data collection process. We acknowledge the significant contributions made by our collaborators, including radiologists, physicians, and technicians, who assisted with the data collection and labeling process. This research is partially supported by the Institute for Advanced Research (IAR) Publication Grant of United International University (UIU), Dhaka, Bangladesh, through the Research Publication Grant number: IAR-2023-Pub-014 and the Islamic University of Technology (IUT), Gazipur, Bangladesh.
Author contributions
I.A. generated the database and retrieved the majority of images from LAMC. M.A.R. collected images from PDC and AGH. I.A. cleaned the data. I.A., F.Z.P. helped in the distribution analysis of the database. I.A. annotated the images from the labels provided by the radiologists. M.A.R. and I.A. did the technical analysis of the dataset. I.A. drafted the paper. T.A., T.M.C. and S.S. supervised the project and participated actively in manuscript preparation.
Code availability
The conversion of DICOM to JPEG image format was done using the proprietary software of the X-ray machines, from brands like Fujifilm and Philips, and hence that software could not be made available. The mask annotations for segmentation were done using an open-source web tool named makesense.ai. It was also used for generating VGG annotations from the COCO format. As explained in the Methods section, the annotation conversion procedures from COCO to YOLO and YOLO to PASCAL VOC were performed using Python 3.10.1 on a Windows 11 operating system using ‘coco2yolo.ipynb’ and ‘yolo2voc.ipynb’. Both Jupyter notebooks can be found inside the ‘Utilities’ folder along with the dataset at Figshare (10.6084/m9.figshare.22363012). The code used for technical validation can be accessed at https://github.com/XLR8-07/FracAtlas. There are 2 notebooks inside ‘notebooks’ under the root folder, called ‘Train_8s.ipynb’ and ‘Prediction_8s.ipynb’. ‘Train_8s.ipynb’ is used to train 2 models, the ‘YOLO8s_seg’ and ‘YOLO8s’ variants, targeted toward segmentation and localization tasks respectively. ‘Prediction_8s.ipynb’ is used to generate predictions from the 2 aforementioned models and view the results.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Iftekharul Abedeen, Md. Ashiqur Rahman.
References
- 1.Yadav, D. & Rathor, S. Bone fracture detection and classification using deep learning approach. In 2020 International Conference on Power Electronics & IoT Applications in Renewable Energy and its Control (PARC), 282–285 (IEEE, 2020).
- 2.Nguyen HQ, et al. Vindr-cxr: An open dataset of chest x-rays with radiologist’s annotations. Scientific Data. 2022;9:429. doi: 10.1038/s41597-022-01498-w.
- 3.LeCun, Y. et al. Deep learning. Nature 521, 436–444 (2015).
- 4.Redmon, J., Divvala, S., Girshick, R. & Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, 779–788 (2016).
- 5.Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18, 234–241 (Springer, 2015).
- 6.Althnian A, et al. Impact of dataset size on classification performance: an empirical evaluation in the medical domain. Applied Sciences. 2021;11:796. doi: 10.3390/app11020796.
- 7.Barbedo JGA. Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification. Computers and electronics in agriculture. 2018;153:46–53. doi: 10.1016/j.compag.2018.08.013.
- 8.Soekhoe, D., Van Der Putten, P. & Plaat, A. On the impact of data set size in transfer learning using deep neural networks. In Advances in Intelligent Data Analysis XV: 15th International Symposium, IDA 2016, Stockholm, Sweden, October 13–15, 2016, Proceedings 15, 50–60 (Springer, 2016).
- 9.Razzak, M. I., Naz, S. & Zaib, A. Deep learning for medical image processing: Overview, challenges and the future. Classification in BioApps: Automation of Decision Making 323–350 (2018).
- 10.Rajpurkar, P. et al. Mura: Large dataset for abnormality detection in musculoskeletal radiographs. arXiv preprint arXiv:1712.06957 (2017).
- 11.NLM, N. Medpix, u.s. national library of medicine. national institutes of health. MedPix https://medpix.nlm.nih.gov/home.
- 12.Nagy E, Janisch M, Hržić F, Sorantin E, Tschauner S. A pediatric wrist trauma x-ray dataset (grazpedwri-dx) for machine learning. Scientific Data. 2022;9:222. doi: 10.1038/s41597-022-01328-z.
- 13.Sheehy L, et al. Does measurement of the anatomic axis consistently predict hip-knee-ankle angle (hka) for knee alignment studies in osteoarthritis? analysis of long limb radiographs from the multicenter osteoarthritis (most) study. Osteoarthritis and cartilage. 2011;19:58–64. doi: 10.1016/j.joca.2010.09.011.
- 14.Nguyen, H. T. et al. Vindr-spinexr: A deep learning framework for spinal lesions detection and classification from radiographs (2021).
- 15.Wang, X. et al. Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2097–2106 (2017).
- 16.Park H-J, et al. A practical mri grading system for osteoarthritis of the knee: association with kellgren–lawrence radiographic scores. European journal of radiology. 2013;82:112–117. doi: 10.1016/j.ejrad.2012.02.023.
- 17.Pham HH, Nguyen NH, Tran TT, Nguyen TN, Nguyen HQ. Pedicxr: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children. Scientific Data. 2023;10:240. doi: 10.1038/s41597-023-02102-5.
- 18.Nguyen HT, et al. Vindr-mammo: A large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography. Scientific Data. 2023;10:277. doi: 10.1038/s41597-023-02100-7.
- 19.Abedeen I, 2023. Fracatlas: A dataset for fracture classification, localization and segmentation of musculoskeletal radiographs. figshare. doi: 10.6084/m9.figshare.22363012.
- 20.Kar MK, Nath MK, Neog DR. A review on progress in semantic image segmentation and its application to medical images. SN computer science. 2021;2:397. doi: 10.1007/s42979-021-00784-5.
- 21.Thian YL, et al. Convolutional neural networks for automated fracture detection and localization on wrist radiographs. Radiology: Artificial Intelligence. 2019;1:e180001. doi: 10.1148/ryai.2019180001.
- 22.Raisuddin AM, et al. Critical evaluation of deep neural networks for wrist fracture detection. Scientific reports. 2021;11:1–11. doi: 10.1038/s41598-021-85570-2.
- 23.Xue L, et al. Detection and localization of hand fractures based on ga_faster r-cnn. Alexandria Engineering Journal. 2021;60:4555–4562. doi: 10.1016/j.aej.2021.03.005.
- 24.Parfitt AM. Bone age, mineral density, and fatigue damage. Calcified tissue international. 1993;53:S82–S86. doi: 10.1007/BF01673408.
- 25.Lin, T.-Y. et al. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V 13, 740–755 (Springer, 2014).
- 26.Halabi SS, et al. The rsna pediatric bone age machine learning challenge. Radiology. 2019;290:498–503. doi: 10.1148/radiol.2018180736.