Journal of Pathology Informatics. 2023 Dec 30;15:100361. doi: 10.1016/j.jpi.2023.100361

Artificial intelligence for human gunshot wound classification

Jerome Cheng a, Carl Schmidt a, Allecia Wilson a, Zixi Wang b, Wei Hao b, Joshua Pantanowitz c, Catherine Morris a, Randy Tashjian a, Liron Pantanowitz a
PMCID: PMC10792621  PMID: 38234590

Abstract

Certain features are helpful in the identification of gunshot entrance and exit wounds, such as the presence of muzzle imprints, peripheral tears, stippling, bone beveling, and wound border irregularity. Less straightforward cases, however, can pose challenges to an emergency room doctor or forensic pathologist. In recent years, deep learning has shown promise in various automated medical image classification tasks.

This study explores the feasibility of using a deep learning model to classify entry and exit gunshot wounds in digital color images. A collection of 2418 images of entrance and exit gunshot wounds was procured. From these, 2028 entrance and 1314 exit wound images were cropped, focusing on the area around each gunshot wound. A ConvNext Tiny deep learning model was trained using the Fastai deep learning library, with a train/validation split ratio of 70/30, until a maximum validation accuracy of 92.6% was achieved. An additional 415 entrance and 293 exit wound images were collected for the test (holdout) set. The model achieved an accuracy of 87.99%, precision of 83.99%, recall of 87.71%, and F1-score of 85.81% on the holdout set, correctly classifying 88.19% of entrance wounds and 87.71% of exit wounds. These results are comparable to what a forensic pathologist can achieve without other morphologic cues. This study represents one of the first applications of artificial intelligence to the field of forensic pathology and demonstrates that deep learning models can discern entrance and exit gunshot wounds in digital images with high accuracy.

Keywords: Artificial intelligence, Deep learning, Convolutional neural network, Human gunshot wound

Introduction

Gunshot wound (GSW) characterization is an important responsibility of emergency room physicians and forensic pathologists that may carry legal implications.1 When evaluating a GSW, the examining forensic pathologist must document the wound's size, shape, site, and location.2 Wound severity is affected by several factors, including the type of firearm, bullet construction, mass, velocity, and soft tissue properties.3 The distance of fire, point of entry/exit, and whether the wound was accidental or self-inflicted (manner of death) are additional points that must be determined.

Several features are helpful in the identification of gunshot entrance and exit wounds, such as the presence of muzzle imprints, soot deposition, peripheral tears, gunpowder stippling, abrasion collars, bone beveling, and wound border irregularity. At the point of bullet contact, distant-range shots produce an abrasion collar due to detachment of the epidermis from deeper skin layers.4 The width of abrasion collars is influenced by bullet design, caliber, velocity, angle, and the affected body region.5 Entrance wounds are usually round or oval,4 but show a stellate appearance when occurring over bony surfaces. Exit wounds tend to be larger, irregularly outlined, and lack gunpowder stippling and soot deposition. Some cases are less straightforward and can pose challenges to an emergency room doctor or forensic pathologist; the heterogeneous appearance of these wounds makes them difficult to classify reliably.

Herein, we propose using a deep learning (DL) model trained on thousands of images to assist in the classification of GSWs. To our knowledge, this is one of the first studies using artificial intelligence (AI) to help classify human GSWs. One prior proof-of-concept study applied DL to simulated gunshots on piglet carcasses to differentiate contact shots from close-range and distant-range shots.6

In recent years, DL, which is a type of AI, has proven capable of classifying images from various subject matters with high accuracy, at times even surpassing human performance.7 DL refers to neural networks with many layers. Convolutional neural networks (CNNs) are a type of DL often used to solve computer vision problems such as image classification, detection, and segmentation. CNNs employ a hierarchical set of weights and processing layers to generate data representations that are capable of making predictions from digital images.8 Given a sufficiently large dataset, a CNN model can automatically learn a set of hierarchical features (e.g., curves, colors, shapes) to represent data, oftentimes enabling it to classify images with accuracy comparable to a trained pathologist.9 DL has been widely used for histopathological cancer identification/grading, mitosis detection, automated immunohistochemistry stain scoring, cell counting, and other image analysis tasks. Several DL-based tools have already attained FDA approval and/or CE marking, including software for classifying/grading prostate cancer.10
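As an illustration of the hierarchical feature extraction described above, the sketch below applies a hand-crafted edge-detection kernel to a toy image using plain NumPy. In a trained CNN these kernels are learned from data rather than specified by hand; nothing here comes from the study's code.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical step edge: dark left half, bright right half.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# Sobel-style vertical edge detector -- a hand-crafted example of the kind
# of low-level feature a CNN's early layers learn automatically.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

response = conv2d(image, sobel_x)
# The filter responds only where the edge is, not in the flat dark region.
print(response)
```

Deeper layers combine such responses into progressively more abstract features (shapes, textures), which is what allows a CNN to discriminate wound morphologies.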

DL, and CNNs in particular, have been around for several decades, but only started gaining widespread attention and usage over the past decade. The recent surge in popularity may be attributed to a number of factors, including the availability of faster computers, graphical processing units with more memory, large public image datasets, freely available DL architectures, and open-source tools for training DL models. Even though DL-based tools can be highly accurate, these algorithms do have flaws, including the need for large amounts of training data, susceptibility to adversarial attacks,11 and a lack of explainability, which has led to their “black box” designation.

Materials and methods

Data preparation

A collection of 2418 digital color images (JPEG format) of entrance and exit GSWs was procured from our institution's forensic archive. From these, 2028 entrance and 1314 exit wound images were cropped, focusing on the area around each GSW. Some images contained multiple entry/exit wounds, which is why the number of cropped images exceeds the initial number of images. Images were resized so that their largest dimension was 300 pixels, maintaining their aspect ratio, and augmented into 224 × 224 pixel images with varying combinations of horizontal flip, vertical flip, zoom, rotation, warp, and brightness. Several DL architectures were trained using Fastai and Jupyter Notebook on a system running Microsoft Windows 10, with a train/validation split ratio of 70/30 for 50 epochs, to determine which architecture worked best with our dataset. Fastai is an open-source PyTorch-based DL framework that can provide state-of-the-art results using various DL architectures with relatively few lines of code.12 Computer hardware specifications included an Intel i9-9900KF CPU, 64 GB of RAM, and an RTX 3090 GPU. The ConvNext Tiny model performed best in the initial trial (attaining a maximum accuracy of 88.5% after 50 epochs) and was further trained until a maximum validation accuracy of 92.6% was achieved. Other models explored in the initial trial included popular architectures such as ResNet50, ConvNext Large, EfficientNetV2S, and MobileNetV3Large; after 50 epochs, these achieved maximum validation accuracies of 87.5%, 87.5%, 86.9%, and 83.8%, respectively. An additional 415 entrance and 293 exit wound images were collected for the test (holdout) set. Predictions were made on the holdout set, and the 100 images with the highest losses were anonymized and distributed to 2 experienced forensic pathologists for review.
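The resizing step described above (largest dimension capped at 300 pixels, aspect ratio preserved) can be sketched with Pillow; `resize_max_dim` and the example image size are hypothetical, not taken from the study's pipeline.

```python
from PIL import Image

def resize_max_dim(img: Image.Image, max_dim: int = 300) -> Image.Image:
    """Scale so the largest dimension equals max_dim, preserving aspect ratio."""
    w, h = img.size
    scale = max_dim / max(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)

# Hypothetical example: a 1200 x 800 wound photograph becomes 300 x 200.
photo = Image.new("RGB", (1200, 800))
resized = resize_max_dim(photo)
print(resized.size)  # (300, 200)
```

During training, Fastai's `aug_transforms` batch transforms can then supply the flip, rotation, zoom, warp, and brightness augmentations at 224 × 224.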

Statistical analysis

We compared the classification rates for entrance wounds, exit wounds, and overall performance between the 2 pathologists and the AI, using 100 images. To analyze the paired data between the pathologists and AI, we utilized McNemar's test, which was applied to 3 two-dimensional contingency tables, resulting in 3 p-values corresponding to entrance wounds, exit wounds, and overall assessment. In addition, we plotted the receiver operating characteristic curve and calculated the area under the curve (AUC) for the holdout set using the R package “pROC”.13 All statistical analyses were conducted in R software (version 4.1.2).
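The paired comparison can be sketched in plain Python; this is an illustrative re-implementation of the continuity-corrected chi-square form of McNemar's test (the default in R's `mcnemar.test`), not the authors' code, with the discordant counts taken from Table 1.

```python
import math

def mcnemar_p(b: int, c: int) -> float:
    """Continuity-corrected McNemar chi-square test on discordant pairs.

    b = cases one rater classified correctly and the other did not;
    c = the reverse. Returns the p-value for chi-square with 1 df.
    """
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # For 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(stat / 2))

# Discordant counts from the pathologists-vs-AI comparison (Table 1):
# exit wounds: pathologists-only correct b = 8, AI-only correct c = 5.
p_exit = mcnemar_p(8, 5)    # ~0.58, not significant
p_total = mcnemar_p(47, 6)  # overall comparison, highly significant
print(round(p_exit, 2), p_total < 1e-4)
```

Only the discordant cells enter the statistic; cases both raters got right (or both got wrong) carry no information about which rater is better.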

Results

The ConvNext Tiny model achieved an accuracy of 87.99% (95% CI 82%–94%), precision of 83.99% (95% CI 73.17%–93.33%), recall of 87.71% (95% CI 77.78%–95.83%), specificity of 88.19% (95% CI 80%–95.08%), F1-score of 85.81% (95% CI 77.65%–92.47%), and AUC of 0.946 (95% CI 0.931–0.962, Fig. 3) on the holdout set. The model correctly classified 88.19% of entrance wounds and 87.71% of exit wounds, yielding 366 true entrances, 257 true exits, 49 false exits, and 36 false entrances (Table 2). These results are comparable to what a forensic pathologist can achieve without other cues.14

Fig. 3. ROC curve based on AI prediction probabilities on the holdout set. The AUC was 0.946 (95% CI 0.931–0.962).

Table 2. Prediction accuracy of AI on the holdout set (n = 708). The AI prediction is considered “entrance” if the predicted probability for “entrance” is ≥0.5; otherwise it is “exit”.

Gunshot wounds       AI correct    AI incorrect
Entrance (n = 415)   366 (88%)     49 (12%)
Exit (n = 293)       257 (88%)     36 (12%)
Total (n = 708)      623 (88%)     85 (12%)
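The holdout metrics reported above can be reproduced from the Table 2 counts. The sketch below treats “exit” as the positive class, an inference from the reported numbers rather than a detail stated in the paper.

```python
# Holdout confusion-matrix counts (Table 2), with "exit" as positive class.
tp = 257  # true exits
tn = 366  # true entrances
fp = 49   # false exits (entrance wounds predicted as exit)
fn = 36   # false entrances (exit wounds predicted as entrance)

accuracy = (tp + tn) / (tp + tn + fp + fn)      # 0.8799
precision = tp / (tp + fp)                      # 0.8399
recall = tp / (tp + fn)                         # 0.8771 (exit accuracy)
specificity = tn / (tn + fp)                    # 0.8819 (entrance accuracy)
f1 = 2 * precision * recall / (precision + recall)  # 0.8581

print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} specificity={specificity:.2%} f1={f1:.2%}")
```

Note that recall and specificity here correspond directly to the per-class rates quoted in the text: 87.71% of exit wounds and 88.19% of entrance wounds correctly classified.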

The 100 images that the AI model had the most difficulty classifying were presented to 2 experienced forensic pathologists to assess whether they also found these images challenging. These challenging images consisted of 54 entrance wounds and 46 exit wounds. On this subset, the AI model correctly classified 5 (9%) entrance wounds and 10 (22%) exit wounds, while the pathologists correctly classified 43 (80%) entrance and 13 (28%) exit wounds. Overall, the pathologists outperformed the AI (p < .0001), misclassifying 44% of the images compared to 85% for the AI (Table 1). Interestingly, the prediction accuracy for exit wounds was not significantly different between pathologists (72% misclassified) and AI (78% misclassified) (p = .58).

Table 1. Prediction accuracy comparison between pathologists and AI on the top-100 loss images. p-values were obtained from McNemar's test.

Gunshot wound       Pathologists correct   Pathologists incorrect   AI correct   AI incorrect   p-value
Entrance (n = 54)   43 (80%)               11 (20%)                 5 (9%)       49 (91%)       <.0001*
Exit (n = 46)       13 (28%)               33 (72%)                 10 (22%)     36 (78%)       .58
Total (n = 100)     56 (56%)               44 (44%)                 15 (15%)     85 (85%)       <.0001*

Paired counts underlying the McNemar tests:

Entrance GSW             AI correct   AI incorrect
Pathologists correct     4            39
Pathologists incorrect   1            10

Exit GSW                 AI correct   AI incorrect
Pathologists correct     5            8
Pathologists incorrect   5            28

Total                    AI correct   AI incorrect
Pathologists correct     9            47
Pathologists incorrect   6            38

Discussion

By comparing several DL architectures, we were able to develop an AI-based model that distinguishes entrance from exit GSWs in digital color forensic images with high accuracy. Of the 100 images with the highest error rates (including 85 misclassified pictures), some were obvious entrance wounds (Fig. 1), while others were more ambiguous (Fig. 2). One case misclassified by our AI-based model had stippling around the entrance wound, a characteristic never present in exit wounds. One hypothetical scenario in which the AI-based model may have difficulty determining entrance versus exit wounds is a case with multiple gunshot wounds and at least 1 retained projectile, as confirmed by post-mortem radiographs; in this situation, the model may not reliably resolve which wounds are entrances and which are exits. Another limitation of the system is that image analysis was confined to 2 dimensions. Furthermore, forensic pathologists often manipulate tissues and probe GSWs with instruments to supplement their understanding of wound characteristics, neither of which an AI-based model can replicate. Clearly, these examples indicate that there are circumstances under which wound classification cannot be performed by an AI-based model with absolute accuracy.

Fig. 1. Entry wounds from the holdout set misclassified by AI (A–D). Abrasion collars are a characteristic feature.

Fig. 2. Exit wounds from the holdout set misclassified by AI (A–D). Some of these wounds are difficult to classify without external cues.

Adding to the challenge, some GSWs present atypically under certain circumstances. For instance, entrance wounds from bullets that first strike an intermediary target (e.g., victims shot in motor vehicles) may exhibit an atypical appearance. Wounds in skin folds of obese individuals, especially when the folds are in contact with each other, can also be a diagnostic problem. Soot deposition around entrance wound margins may be absent when a bullet passes through clothing.15 In addition, a close contact shotgun wound to the head can produce a stellate exit with radial tears,16 and cranial close-range entrance wounds with radial skin tears may even resemble an exit wound.17

Challenges/Limitations

In general, DL models benefit from more data, with prediction accuracy improving as training samples are added.18 However, image collection and preparation is very time-consuming, so we limited our data to 2000+ images. The images were taken by different photographers and varied in quality; focus, exposure, and lighting (daylight versus tungsten) all affected the final pictures, and some underexposed images may have degraded model performance. Our dataset also contained more entrance wounds than exit wounds, and it is plausible that there were not enough examples of every condition. Radial tears in contact entrance wounds in skin overlying bone (such as the skull) may impart a stellate appearance, which our DL model may have difficulty differentiating from stellate-appearing exit wounds without being trained on a larger number of similar samples.

Conclusions

This study represents one of the first applications of AI to the field of forensic pathology. It demonstrates that it is feasible to train a DL model to classify entrance and exit GSWs in digital images with reasonable accuracy. In challenging situations where AI cannot reliably classify a GSW, additional information, including the circumstances of the case, radiographs, and the ability to manipulate the body itself, is essential to reach a more definitive assessment.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

L. Pantanowitz is on the scientific advisory board for Ibex and NTP, serves as a consultant for AIxMed and Hamamatsu, and is a co-owner of Placenta AI and LeanAP. L.P. and J.C. serve on the editorial board of the Journal of Pathology Informatics. All other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Denton J.S., Segovia A., Filkins J.A. Practical pathology of gunshot wounds. Arch Pathol Lab Med. 2006;130:1283–1289. doi: 10.5858/2006-130-1283-PPOGW.
2. Shrestha R., Kanchan T., Krishan K. Gunshot wounds forensic pathology. In: StatPearls. StatPearls Publishing; 2022.
3. White K.M. Injuring mechanisms of gunshot wounds. Crit Care Nurs Clin North Am. 1989;1:97–103.
4. Grosse Perdekamp M., Vennemann B., Mattern D., Serr A., Pollak S. Tissue defect at the gunshot entrance wound: what happens to the skin? Int J Legal Med. 2005;119:217–222. doi: 10.1007/s00414-005-0542-z.
5. Pircher R., et al. The influence of the bullet shape on the width of abrasion collars and the size of gunshot entrance holes. Int J Legal Med. 2017;131:441–445. doi: 10.1007/s00414-016-1501-6.
6. Oura P., Junno A., Junno J.-A. Deep learning in forensic gunshot wound interpretation–a proof-of-concept study. Int J Legal Med. 2021;135:2101–2106. doi: 10.1007/s00414-021-02566-3.
7. He K., Zhang X., Ren S., Sun J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. 2015. Preprint at http://arxiv.org/abs/1502.01852.
8. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539.
9. van der Laak J., Litjens G., Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med. 2021;27:775–784. doi: 10.1038/s41591-021-01343-4.
10. Meroueh C., Chen Z.E. Artificial intelligence in anatomical pathology: building a strong foundation for precision medicine. Hum Pathol. 2023;132:31–38. doi: 10.1016/j.humpath.2022.07.008.
11. Yuan X., He P., Zhu Q., Li X. Adversarial examples: attacks and defenses for deep learning. 2018. Preprint at http://arxiv.org/abs/1712.07107.
12. Howard J., Gugger S. fastai: a layered API for deep learning. Information. 2020;11:108.
13. Robin X., et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011;12:77. doi: 10.1186/1471-2105-12-77.
14. Heninger M. Concordance rate for the identification of distant entrance gunshot wounds of the back by experienced forensic pathologists examining only images of autopsies. J Forensic Sci. 2016;61:352–360. doi: 10.1111/1556-4029.13011.
15. Haag L.C., Jason A. Misleading entry wounds from atypical bullet behavior. Am J Forensic Med Pathol. 2022;43:174–182. doi: 10.1097/PAF.0000000000000727.
16. Kunz S.N. An unusual exit wound as a result of a shotgun suicide to the head. Forensic Sci Int. 2017;275:e1–e5. doi: 10.1016/j.forsciint.2017.03.015.
17. Lantz P.E. An atypical, indeterminate-range, cranial gunshot wound of entrance resembling an exit wound. Am J Forensic Med Pathol. 1994;15:5–9. doi: 10.1097/00000433-199403000-00002.
18. Anand D., et al. Deep learning to estimate human epidermal growth factor receptor 2 status from hematoxylin and eosin-stained breast tissue images. J Pathol Inform. 2020;11:19. doi: 10.4103/jpi.jpi_10_20.
