Abstract
Injured extremities commonly need to be immobilized by casts to allow proper healing. We propose a method to suppress cast superimpositions in pediatric wrist radiographs based on the cycle generative adversarial network (CycleGAN) model. We retrospectively reviewed unpaired pediatric wrist radiographs (n = 9672) and sampled them into 2 equal groups, with and without cast. The test subset consisted of 718 radiographs with cast. We evaluated different square input sizes (256, 512, and 1024 pixels) for U-Net and ResNet-based CycleGAN architectures in cast suppression, quantitatively and qualitatively. The mean age was 11 ± 3 years in images containing cast (n = 4836), and 11 ± 4 years in castless samples (n = 4836). A total of 5956 radiographs had been acquired in males and 3716 in females. A U-Net 512 CycleGAN performed best (P ≤ .001). CycleGAN models successfully suppressed casts in pediatric wrist radiographs, allowing the development of a related software tool for radiology image viewers.
Keywords: artificial intelligence, diagnostic imaging, radiography, wrist, child
HIGHLIGHTS
Plaster casts in pediatric wrist radiographs can be successfully suppressed through generative adversarial network (GAN) computer vision algorithms.
A CycleGAN with U-Net architecture and a square input size of 512 pixels achieved significantly better cast suppression than the other evaluated models in our sample dataset of pediatric wrist radiographs.
CycleGAN-based cast suppression could be implemented as a tool in radiology software, allowing users to blend dynamically between original and cast-suppressed versions.
INTRODUCTION
Injured extremities commonly need to be immobilized by plaster casts or splints.1 It might not be suitable or possible to remove a cast before acquiring radiographs, especially when assessing alignment or consolidation in the follow-up of fractures. Unfortunately, casts impair bone visibility in radiographs by increasing the thickness and density of the examined region. To date, there are no reports about cast suppression methods in extremity X-rays.
Artificial intelligence has opened a variety of possible solutions for problems in the field of image processing, manipulation, and enhancement. Different generative models have been successfully implemented in image-to-image translation tasks in computed tomography2 and magnetic resonance imaging,3–5 specifically variational autoencoders6 and generative adversarial networks (GAN).7 Variational autoencoders are simple to train but produce blurry images that lack details.8–10 GANs usually generate sharper images but face challenges in training stability and sampling diversity, especially when synthesizing high-resolution images.6 CycleGAN is a GAN type able to achieve image-to-image translation in unpaired samples.11 One of the recent uses of CycleGAN in medicine was bone suppression on chest radiographs by Liang et al.12
The contributions of this paper are as follows:
We are the first to tackle and offer a solution for cast suppression in X-ray images.
We evaluated the CycleGAN method using rigorous qualitative and quantitative analyses. The process of evaluation yielded the optimal CycleGAN model for the given problem.
We propose a ready-to-use software implementation for cast suppression based on the proposed method.
MATERIALS AND METHODS
The ethics committee of the Medical University of Graz (IRB00002556) approved this retrospective study, waiving the requirement for informed consent. To make the presented work easier to follow, we provide a diagram consisting of 3 major parts in Figure 1 that will be explained in detail in subsequent sections.
Figure 1.
Illustration of our research: subfigure (A) represents the input data. Subfigure (B) shows the CycleGAN architecture: generator G removes the cast from the images, while generator F applies a cast to the images. Discriminator Dx distinguishes between fake and original images with cast, while discriminator Dy does the same for images without cast. Subfigure (C) depicts the applications of the trained generators G and F.
Dataset
The images were sampled from a pediatric wrist digital radiography dataset of the Medical University of Graz, containing 20 330 images acquired between 2008 and 2018. We selected 9672 images free of radio-opaque foreign materials, split into images with and without cast in equal shares (n = 4836 per group). The mean age was 11 ± 3 years in radiographs containing cast and 11 ± 4 years in castless samples. A total of 5956 X-rays had been done in boys (mean age 11 ± 4 years) and 3716 in girls (mean age 10 ± 3 years). Images were processed as 8-bit grayscale portable network graphic files. Preprocessing included padding of nonsquare images with black pixels, followed by re-scaling to 1024 × 1024 pixels with Lanczos interpolation.13 The fingers invariably needed to point to the upper image border. We stratified the radiographs into subsets: train subset (n = 7600), validation subset (n = 636), and test subset (n = 1436). The castless half (n = 718) of the test set served as a reference in quantitative analyses, while the other half containing cast (n = 718) was used in testing cast suppression performance. Having an even number of images with and without the cast is beneficial for the CycleGAN training (it makes it easier to build a dataset and helps with the understanding and debugging of the training process). Example images from the training subset are shown in Figure 1A.
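The padding step of the preprocessing can be illustrated with a minimal sketch in plain Python. The function name `pad_to_square` and the choice to center the image inside the padding are assumptions made for illustration; the source only states that nonsquare images were padded with black pixels.

```python
def pad_to_square(image, fill=0):
    """Pad a 2D grayscale image (a list of rows) with `fill` pixels to a square.

    Centering the image inside the padding is an assumption; the text only
    says that black pixels were added to nonsquare images.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    side = max(height, width)
    top = (side - height) // 2    # rows of padding above the image
    left = (side - width) // 2    # columns of padding left of the image
    padded = [[fill] * side for _ in range(side)]
    for r, row in enumerate(image):
        padded[top + r][left:left + width] = row
    return padded


# Toy 2 x 4 "radiograph" with 8-bit intensities (made-up values).
toy = [[10, 20, 30, 40],
       [50, 60, 70, 80]]
square = pad_to_square(toy)
```

In a real pipeline, the padded image would then be re-scaled to 1024 × 1024 pixels with a Lanczos filter from an image library such as Pillow or OpenCV; that step is omitted here to keep the sketch dependency-free.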
CycleGAN
Our dataset did not contain pairs of identical radiographs (with and without cast), as they cannot be reliably obtained due to variations in positioning a wrist and/or the X-ray equipment. Therefore, we favored CycleGAN11 as a qualified method for unpaired data over DualGAN14 or HarmonicGAN,15 which are complex and can be difficult to train.16 Moreover, because we are interested in creating precisely one output image, we favored CycleGAN over the MUNIT method that uses latent space to generate new samples in the target domain.17,18
CycleGAN is an unsupervised approach for learning the translation of images from the source domain (in our case, images with cast) to the destination domain (images without cast), and vice versa:
mapping G from X to Y, such that the distributions attained through generator G(X) are impossible to distinguish from the ones in Y by the discriminator Dy, and
mapping F from Y to X, such that the distributions attained through generator F(Y) are impossible to distinguish from the ones in X by the discriminator Dx.
Therefore, CycleGAN consists of 4 neural networks: 2 generator networks (G and F) and 2 discriminator networks (Dx and Dy), depicted in Figure 1B.
Model training
The learning process for generators G and F is to maximize the probability that discriminators Dy and Dx produce errors. In our case, generator G aims to remove a cast from the input, while discriminator Dy must distinguish real images without cast from generated fakes. To achieve these goals, CycleGAN is trained to optimize a composite loss function that includes both generators and discriminators, composed of:
- Adversarial loss is calculated for each of the 2 transformations (X → Y and Y → X). It forces the generators to produce realistic images in the target domain to fool the discriminators.
- Cycle-consistency loss addresses the problem of vanishing fractures (or any other detail) by requiring that an image remain the same after both generators are applied in sequence.
- Identity loss forces a generator to keep its input unchanged, or nearly unchanged, if the input already belongs to the generator's target domain.17
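For orientation, the 3 loss terms listed above can be written out following the formulation of the original CycleGAN paper.11 This is a sketch rather than the exact objective used in this study: the weights λ and λ_id are generic symbols, and x and y denote samples from the domains X (with cast) and Y (without cast).

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{GAN}}(G, D_y, X, Y) &= \mathbb{E}_{y}\big[\log D_y(y)\big] + \mathbb{E}_{x}\big[\log\big(1 - D_y(G(x))\big)\big] \\
\mathcal{L}_{\mathrm{cyc}}(G, F) &= \mathbb{E}_{x}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y}\big[\lVert G(F(y)) - y \rVert_1\big] \\
\mathcal{L}_{\mathrm{id}}(G, F) &= \mathbb{E}_{y}\big[\lVert G(y) - y \rVert_1\big] + \mathbb{E}_{x}\big[\lVert F(x) - x \rVert_1\big] \\
\mathcal{L}(G, F, D_x, D_y) &= \mathcal{L}_{\mathrm{GAN}}(G, D_y, X, Y) + \mathcal{L}_{\mathrm{GAN}}(F, D_x, Y, X) + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G, F) + \lambda_{\mathrm{id}}\, \mathcal{L}_{\mathrm{id}}(G, F)
\end{aligned}
```

The generators minimize this objective while the discriminators maximize it. The original paper replaces the logarithmic adversarial term with a least-squares variant during training for stability; whether the same substitution was used here is not stated in the text.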
The generator models were a ResNet-based topology, similar to that of Johnson et al19 (as in the original CycleGAN paper), and a U-Net-based topology as proposed by Ronneberger et al20 (as used in various published CycleGAN implementations). Minor adjustments were necessary to account for the different square input sizes of 256, 512, and 1024 pixels:
ResNet 256: composed of a network base with 2 downsampling layers and 9 residual blocks, and a head with 2 upsampling layers.
ResNet 512: composed of the network base, followed by 12 residual blocks, and the network head.
ResNet 1024: could not be trained due to computational constraints.
U-Net 256: the number of downsampling and upsampling layers is 8.
U-Net 512: the model architecture is similar to U-Net 256, but the number of downsampling and upsampling layers is set to 9.
U-Net 1024: the model architecture is similar to U-Net 256, but the number of downsampling and upsampling layers is set to 10.
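The U-Net layer counts above match a simple rule: if every downsampling layer halves the spatial size, a square input of side 2^k needs k layers to reach a 1 × 1 bottleneck. The sketch below checks this arithmetic. The halving behavior and the 1 × 1 bottleneck are assumptions borrowed from common pix2pix-style U-Net generators, not details confirmed by the text, and `downsamplings_to_1x1` is a hypothetical helper name.

```python
def downsamplings_to_1x1(input_size):
    """Count the stride-2 halvings needed to shrink a square input to 1 x 1.

    Assumes every downsampling layer halves the spatial size exactly,
    which requires the input size to be a power of two.
    """
    if input_size < 1 or input_size & (input_size - 1):
        raise ValueError("input_size must be a positive power of two")
    return input_size.bit_length() - 1


# Input sizes evaluated in the study, mapped to the implied layer counts.
depths = {size: downsamplings_to_1x1(size) for size in (256, 512, 1024)}
```

Under these assumptions, the result is 8, 9, and 10 layers for the 256, 512, and 1024 pixel inputs, in line with the counts listed for the 3 U-Net variants.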
The discriminators were identical for all trained models: PatchGANs that classify overlapping 70 × 70 pixel patches as real or fake.21 These PatchGANs evaluate an image patch by patch, so that realism is judged on local subregions instead of on the whole image.7 The training hyperparameters were similar to the ones used in the original paper:11
Cycle-consistency loss influence ,
Adam optimizer using batch size equal to 1,
Learning rate , constant for the first 100 epochs and then linearly decreased to 0 over the subsequent 100 epochs.
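The learning rate schedule in the list above can be sketched as a small function. The base learning rate is passed in as a parameter because its value is not given here; `BASE_LR = 1.0` is only a placeholder so that the returned numbers read as multipliers. The 0-indexed epoch convention and the exact point at which the decay starts are assumptions.

```python
def learning_rate(epoch, base_lr, constant_epochs=100, decay_epochs=100):
    """Return the learning rate for a 0-indexed epoch.

    The rate stays at `base_lr` for `constant_epochs` epochs and then
    falls linearly, reaching 0 at epoch `constant_epochs + decay_epochs`.
    """
    if epoch < constant_epochs:
        return base_lr
    remaining = constant_epochs + decay_epochs - epoch
    return base_lr * max(0.0, remaining / decay_epochs)


BASE_LR = 1.0  # placeholder, NOT the value used in the study
schedule = [learning_rate(e, BASE_LR) for e in (0, 99, 100, 150, 200)]
```

With the placeholder base rate, the multiplier is 1.0 up to epoch 100, 0.5 at epoch 150, and 0.0 at epoch 200. Deep learning frameworks usually provide schedulers that accept such a per-epoch function.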
Model evaluation
Quantitative assessment
Casts altered the histograms across the whole pixel intensity spectrum, increasing the counts in the higher intensity range. We compared the average reference histogram, calculated on the castless images of the test subset, with the histogram of each generated cast-suppressed image in the test subset, using the following similarity metrics:
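- Correlation can range from 1 (perfect positive correlation) to −1 (perfect negative correlation), the first representing the best histogram similarity.22
- Histogram intersection sums the bin-wise minima of the 2 histograms. Larger intersections indicate more similar histograms.24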
- Chi-squared distance represents the deviation of observed and expected frequencies. Smaller Chi-squared distances indicate more similar histograms.25
- Hellinger distance is commonly used to quantify the similarity of 2 given distributions.26
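The histogram metrics used in this study (correlation, intersection, Chi-squared distance, and Hellinger distance) can be sketched in plain Python as shown below. The formulas follow definitions commonly found in image processing libraries. They are an assumption about the exact variants used by the authors, in particular the choice of denominator in the Chi-squared distance. All function names and the toy histograms are hypothetical.

```python
import math


def correlation(h1, h2):
    """Pearson correlation of two histograms; 1 means identical shape."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1)
                    * sum((b - m2) ** 2 for b in h2))
    return num / den


def intersection(h1, h2):
    """Sum of bin-wise minima; larger means more overlap."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over non-empty expected bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def hellinger(h1, h2):
    """Hellinger distance; 0 for identical histograms, 1 for disjoint ones."""
    overlap = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    norm = math.sqrt(sum(h1) * sum(h2))
    return math.sqrt(max(0.0, 1.0 - overlap / norm))


# Toy 4-bin histograms (made-up counts, not study data).
reference = [4, 3, 2, 1]
mirrored = [1, 2, 3, 4]
```

Comparing `reference` with itself gives the best possible value of every metric, while the mirrored histogram gives a correlation of −1 and a smaller intersection. On real data, each 256-bin histogram of a cast-suppressed image would be compared with the averaged castless reference histogram in the same way.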
The peak signal-to-noise ratio metric, commonly used to measure noise removal, is not applicable to cast suppression. Namely, we do not possess pairs of images of the same study, which are necessary for peak signal-to-noise ratio evaluation (one image with cast and one image without a cast of the same patient/case study). It needs to be noted that the average reference histogram that was calculated on the castless test subset of images had a similar mean and variance of the pixels’ intensity as the training subset which contained a much larger number of images. Hence, the average reference histogram used is representative and can be considered as a ground truth. Furthermore, to minimize the information loss, we have tested a second approach where we compared generated cast-suppressed histograms against every original castless image histogram in the test set. This approach required much more effort but produced similar results to the ones presented in the paper.
Qualitative assessment
Three radiologists with 8, 3 (in training), and 9 years of experience in musculoskeletal radiology independently and subjectively ranked the generated images from best (1st) to worst (5th) on 20 image subsets. Each subset contained the original reference image and the 5 model-generated images, randomly named, shuffled, and scaled to 512 × 512 pixels. Rank sums served as a surrogate parameter for model performance, calculated by multiplying each rank by its number of occurrences and summing the products; lower scores correspond to higher-ranked models.
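The rank-sum score can be illustrated with a short sketch. The ranks below are invented for illustration and are not study data. With 20 image subsets and ranks from 1 to 5, a model's rank sum for one rater lies between 20 (always ranked 1st) and 100 (always ranked 5th).

```python
from collections import Counter


def rank_sum(ranks):
    """Sum of rank x number of occurrences; lower means better ranked."""
    occurrences = Counter(ranks)
    return sum(rank * count for rank, count in occurrences.items())


# Hypothetical ranks given by one rater to one model over 20 image subsets:
# 12 times 1st, 6 times 2nd, and 2 times 3rd.
example_ranks = [1] * 12 + [2] * 6 + [3] * 2
score = rank_sum(example_ranks)  # 12*1 + 6*2 + 2*3 = 30
```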
Statistical analysis
We used descriptive statistics and comparisons of means to analyze quantitative and qualitative results, computed in SPSS Statistics Version 21 (IBM Corp., Armonk, NY, USA). Quantitative metrics were compared with paired samples t tests, subjective image quality rankings with nonparametric Wilcoxon signed-rank tests for paired samples. P-values below .05 were considered statistically significant. It is important to emphasize that each cast-suppressed image histogram was evaluated separately against the averaged histogram of the castless images in the test subset, creating an experiment that can be treated with a paired samples test.
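To make the paired design concrete, the following sketch computes the paired samples t statistic for 2 models evaluated on the same images. It is not the SPSS procedure used in the study, and the input values are toy numbers. Each position in the 2 lists is assumed to hold the metric of the same test image under 2 different models.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Toy metric values of two hypothetical models on the same four images.
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [2.0, 4.0, 5.0, 7.0]
t_value, dof = paired_t_statistic(model_a, model_b)
```

The P-value then follows from the t distribution with `dof` degrees of freedom, which a statistics package would provide.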
RESULTS
Quantitative assessment
U-Net 512 was the best-performing model in quantitative tests with a correlation of 0.998 ± 0.005, a histogram intersection of 222 503 ± 16 322, a Chi-square distance of 59 451 ± 55 793, and a Hellinger distance of 0.147 ± 0.043. The other models yielded mixed results: U-Net 256 was worst in all metrics, apart from intersection. All differences between the models among each other were statistically significant at the P < .001 level in paired samples t tests, except Chi-square distance between ResNet-256 and ResNet-512 (P = .072), and histogram correlation between ResNet-256 and U-Net 1024 (P = .745). Quantitative results and details are presented in Table 1 and Figure 2.
Table 1.
Comparison of the averaged pixel intensity histogram of the castless test set (reference) with the cast-suppressed histograms generated by each model
| Parameter | Statistic | ResNet-256 | ResNet-512 | U-Net 256 | U-Net 512 | U-Net 1024 |
|---|---|---|---|---|---|---|
| Valid samples | N | 718 | 718 | 718 | 718 | 718 |
| Histogram correlation | Mean ± SD | 0.988 ± 0.021 | 0.993 ± 0.015 | 0.963 ± 0.062 | **0.998 ± 0.005** | 0.988 ± 0.013 |
| Histogram intersection | Mean ± SD | 214 902 ± 22 391 | 207 977 ± 20 316 | 209 344 ± 26 658 | **222 503 ± 16 322** | 212 718 ± 20 778 |
| Chi-square distance | Mean ± SD | 122 012 ± 132 038 | 128 185 ± 111 497 | 271 652 ± 348 462 | **59 451 ± 55 793** | 165 814 ± 161 269 |
| Hellinger distance | Mean ± SD | 0.173 ± 0.061 | 0.199 ± 0.052 | 0.198 ± 0.080 | **0.147 ± 0.043** | 0.188 ± 0.052 |
Note: The best results are given in bold.
Figure 2.
Bar charts displaying the quantitative metrics: (A) Histogram correlation and (B) intersection, (C) Chi-square distance, and (D) Hellinger distance. 95% error indicators are presented. All differences between the models were statistically significant at the P < .001 level, apart from the 2 cases indicated (n.s.: not significant).
Qualitative assessment
U-Net 512 was significantly better ranked than the other models with 27 points median (ResNet-256 and ResNet-512 P < .001, U-Net 256 and 1024 P = .001), in contrast to ResNet-512 with 92 median rating points as the worst model. U-Net 256 and 1024 were regarded equivalent (P = .732). Radiologists’ rankings including cross significances are summarized in Table 2.
Table 2.
Subjective image quality rankings by the 3 radiologists
| Rank sums | ResNet-256 | ResNet-512 | U-Net 256 | U-Net 512 | U-Net 1024 |
|---|---|---|---|---|---|
| Rater 1 | 74 | 82 | 60 | 27 | 57 |
| Rater 2 | 67 | 95 | 60 | 32 | 46 |
| Rater 3 | 71 | 93 | 45 | 33 | 58 |
| Raters (median) | 71 | 92 | 52 | 27 | 53 |
| Significance (P), Wilcoxon signed-rank test | |||||
| ResNet-256 | - | .014* | .005** | <.001*** | .015* |
| ResNet-512 | .014* | - | <.001*** | <.001*** | <.001*** |
| U-Net 256 | .005** | <.001*** | - | .001** | .732 |
| U-Net 512 | <.001*** | <.001*** | .001** | - | .001** |
| U-Net 1024 | .015* | <.001*** | .732 | .001** | - |
Note: Rank sums are given in the top section (color-coded from better [green] to worse [yellow]), cross significances between the models in nonparametric Wilcoxon signed-rank tests are shown at the bottom.
*** P < .001
** P < .01
* P < .05.
DISCUSSION
We report a method to lessen superimpositions of plaster casts in pediatric wrist radiographs. CycleGAN models were able to suppress casts with satisfactory performance in quantitative and qualitative metrics. The U-Net 512 CycleGAN algorithm performed best, indicating that an actual cast suppression software might be possible in the future.
Our literature research did not identify any other published study on the topic of cast suppression in radiographs, neither in children nor in adults. Cast suppression performance might differ between children, adults, and elderly patients due to inherent differences in their bone configurations, mineral densities, and age-related fracture mechanisms. We believe that robust algorithms could be developed, given sufficient input data, but it might be necessary to train different algorithms for different age groups to achieve solid outputs.
To demonstrate the benefits of cast suppression, we created a demo application as a Jupyter notebook, accessible online (https://github.com/fhrzic/CastTool). In Figure 1C, we show only a small portion of this application, which represents the general concept. The application linearly blends between the original and the inferred cast-suppressed image by means of a slider input. This way, users retain full control to adjust the level of cast suppression to their subjective preference on the fly. We envision similar tools being implemented in Picture Archiving and Communication System (PACS) software as a future option.
U-Net 512 was the overall best-performing CycleGAN model, ranking first in both the histogram comparisons and the radiologists’ evaluations. U-Net 256 and 1024 followed in expert ratings but were ranked 3rd and 4th in the histogram evaluations. We consider the experts’ ratings superior to quantitative analyses, because retaining the original image impression is the key requirement in the current application. In this regard, ResNet models produced more synthetic-looking images that were prone to exhibit artifacts (Figure 3). More specifically, ResNet models invariably left behind areas of smudging or blurring that compromised the outputs. While there is a need for optimization and research on the available CycleGAN models, the whole field is evolving rapidly and models are subject to change. GAN improvements should be monitored closely, because novel algorithms might achieve better output images. Figure 3 also shows rare cases in which models left parts of the cast unremoved as artifacts in the image. Nonetheless, a cast suppression effect was still present in these cases.
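The slider-based linear blending of the demo application can be sketched as follows. This is a minimal illustration in plain Python, not the code of the published notebook. The function name `blend`, the slider range from 0 to 1, and the toy pixel values are assumptions.

```python
def blend(original, suppressed, alpha):
    """Linearly blend two equally sized 8-bit grayscale images.

    alpha = 0 returns the original image, alpha = 1 the cast-suppressed
    image; values in between mix the two pixel by pixel.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return [
        [round((1.0 - alpha) * o + alpha * s) for o, s in zip(row_o, row_s)]
        for row_o, row_s in zip(original, suppressed)
    ]


# Toy 2 x 2 images with made-up intensities.
original = [[200, 180], [160, 140]]
suppressed = [[100, 80], [60, 40]]
halfway = blend(original, suppressed, 0.5)  # slider in the middle position
```

Because the blend is computed per pixel from 2 stored images, moving the slider needs no further network inference, which is what allows the adjustment to happen on the fly.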
Figure 3.
A mosaic representing several input images sampled from the test set and the associated outputs generated by the models under consideration. We have sampled images in a way that we display the full potential of the models, as well as cases where the models have difficulties with cast removal (part of the cast would still be present in the images as an artifact).
The current manuscript dealt with cast suppression, while GANs could also perform image translations in the reverse direction by adding casts to castless radiographs (Figure 1C). At first glance, superimposing cast might not be a rational use. However, the training of certain neural networks might potentially profit from cast-augmentation.
A limitation of the proposed method is the fixed input size, which implies that images with higher resolution need down-scaling to match the GAN requirements. As a consequence, variable amounts of information will be lost in high-resolution images, depending on the chosen model. Another limitation is that the generated images are high-fidelity, realistic versions of the expected content, which are not necessarily true to what the radiograph would capture if the image was taken of a wrist without cast. This could be problematic for input images that differ substantially from the content used for training. We believe that this issue can be circumvented to some extent by utilizing more training samples derived from diverse data sources and institutions. It also needs to be mentioned that images pass through the CycleGAN model, which requires capable hardware including a graphics processing unit with at least 8 GB of memory. Otherwise, inference times would be unacceptably long. This manuscript also did not specifically assess the utility of cast suppression for parameters like the depiction of alignment or consolidation. Further prospective studies need to be planned to assess the potential benefits of the proposed method.
In conclusion, cast suppression was successfully performed in radiographs for the first time. We employed CycleGANs on a pediatric wrist trauma dataset and achieved promising results with the U-Net architecture. Although further improvements and validation steps are needed, the proposed cast suppression method has great potential to be implemented as a helpful tool in radiology practice.
FUNDING
This work has been supported in part by the Croatian Science Foundation under the project IP-2020-02-3770 and by the University of Rijeka under the project uniri-tehnic-18-15.
AUTHOR CONTRIBUTIONS
FH, ST, and IŠ contributed to the study concept. Data acquisition was performed by FH, IŽ, and ST. Data analyses and visualization were conducted by FH and ST. FH, IŽ, and ST prepared the manuscript draft. FH, IŽ, ST, and IŠ contributed to reviewing and editing of the manuscript. IŠ supervised the project.
DATA AVAILABILITY
A complete anonymized image dataset of all measurements and anonymized image data is stored and available at the author's institution. It is part of a larger yet unpublished image collection.
ACKNOWLEDGMENTS
We express our thanks to Tiana Grubešić, MD, radiologist from the Department of Radiology, Clinical Hospital Center Rijeka, Croatia and Michael Janisch, MD, radiology resident at the Division of General Radiology, Department of Radiology, Medical University of Graz, Austria, for performing image quality ratings.
CONFLICT OF INTEREST STATEMENT
None declared.
The work originated at the Department of Computer Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia.
REFERENCES
- 1. Delft E, Gelder TGV, Vries R, Vermeulen J, Bloemers FW. Duration of cast immobilization in distal radial fractures: a systematic review. J Wrist Surg 2019; 8 (5): 430–8.
- 2. You C, Li G, Zhang Y, et al. CT super-resolution GAN Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE). IEEE Trans Med Imaging 2020; 39 (1): 188–203.
- 3. Jin CB, Kim H, Liu M, et al. Deep CT to MR synthesis using paired and unpaired data. Sensors (Basel) 2019; 19 (10): 2361. doi: 10.3390/s19102361.
- 4. Jin C-B, Kim H, Liu M, et al. DC2Anet: generating lumbar spine MR images from CT scan data based on semi-supervised learning. Appl Sci 2019; 9 (12): 2521.
- 5. Tezcan KC, Baumgartner CF, Luechinger R, Pruessmann KP, Konukoglu E. MR image reconstruction using deep density priors. IEEE Trans Med Imaging 2019; 38 (7): 1633–42.
- 6. Huang H, Li Z, He R, Sun Z, Tan T. IntroVAE: introspective variational autoencoders for photographic image synthesis. Adv Neural Inf Process Syst 2018; 31: 52–63.
- 7. Goodfellow IJ, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks. arXiv 2014. eprint: 1406.2661.
- 8. Zheng K, Cheng Y, Kang X, Yao H, Tian T. Conditional introspective variational autoencoder for image synthesis. IEEE Access 2020; 8: 153905–13.
- 9. Larsen ABL, Sønderby SK, Larochelle H, Winther O. Autoencoding beyond pixels using a learned similarity metric. In: Maria Florina B, Kilian QW, eds. Proceedings of the 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research. New York, NY: PMLR; 2016: 1558–66.
- 10. Han K, Wen H, Shi J, Lu K-H, Zhang Y, Liu Z. Variational autoencoder: An unsupervised model for encoding and decoding fMRI activity in visual cortex. NeuroImage 2019; 198: 125–36. doi: 10.1016/j.neuroimage.2019.05.039.
- 11. Zhu J, Park T, Isola P, Efros AA. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In: 2017 IEEE International Conference on Computer Vision (ICCV); 2017: 2242–51. doi: 10.1109/ICCV.2017.244.
- 12. Liang J, Tang YX, Tang YB, Xiao J, Summers RM. Bone suppression on chest radiographs with adversarial learning. In: Hahn HK, Mazurowski MA, eds. Medical Imaging 2020: Computer-Aided Diagnosis. Vol. 11314. Bellingham, Washington: International Society for Optics and Photonics; 2020: 1131409.
- 13. Parsania P, Virparia P. A comparative analysis of image interpolation algorithms. Int J Adv Res Comput Commun Eng 2016; 5 (1): 29–34.
- 14. Yi Z, Zhang H, Tan P, Gong M. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. In: 2017 IEEE International Conference on Computer Vision (ICCV); 2017: 2868–76. doi: 10.1109/ICCV.2017.310.
- 15. Zhang R, Pfister T, Li J. Harmonic unpaired image-to-image translation. arXiv 2019. eprint: 1902.0972
- 16. Arjovsky M, Bottou L. Towards principled methods for training generative adversarial networks. arXiv 2017. eprint: 1701.04862.
- 17. Huang X, Liu MY, Belongie S, Kautz J. Multimodal Unsupervised Image-to-Image Translation. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y, eds. Computer Vision – ECCV 2018. Lecture Notes in Computer Science. Vol. 11207. Cham: Springer; 2018. doi: 10.1007/978-3-030-01219-9_11.
- 18. Taigman Y, Polyak A, Wolf L. Unsupervised cross-domain image generation. arXiv 2016. eprint: 1611.02200.
- 19. Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution. In: European Conference on Computer Vision; October 11–14, 2016; Cham: Springer.
- 20. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention; October 5–9, 2015; Cham: Springer.
- 21. Isola P, Zhu J, Zhou T, Efros AA. Image-to-Image Translation with Conditional Adversarial Networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017: 5967–76. doi: 10.1109/CVPR.2017.632.
- 22. Marin-Reyes PA, Lorenzo-Navarro J, Castrillón-Santana M. Comparative study of histogram distance measures for re-identification. arXiv 2016. eprint: 1611.08134.
- 23. Jia W, Zhang H, He X, Wu Q. A Comparison on Histogram Based Image Matching Methods. In: Proceedings of the IEEE International Conference on Video and Signal Based Surveillance (AVSS '06); USA: IEEE Computer Society; 2006: 97. doi: 10.1109/AVSS.2006.5.
- 24. Name M, Lima J, Boff F, Filho D, Falate R. Histogram comparison using intersection metric applied to digital images analysis. Iberoamerican J Appl Comput 2012; 2: 11–8.
- 25. Gagunashvili N. Chi-square tests for comparing weighted histograms. Nucl Instrum Methods Phys Res A 2010; 614 (2): 287–96.
- 26. Le Cam L, Lo Yang G. Asymptotics in Statistics: Some Basic Concepts. New York, NY: Springer; 2000.