Abstract
We describe and validate a feature-based system for calculation of likelihood ratios from 3D digital images of fired cartridge cases. The system includes a database of 3D digital images of the bases of 10 cartridges fired per firearm from approximately 300 firearms of the same class (semi-automatic pistols that fire 9 mm diameter centre-fire Luger-type ammunition, and that have hemispherical firing pins and parallel breech-face marks). The images were captured using Evofinder®, an imaging system that is commonly used by operational forensic laboratories. A key component of the research reported is the comparison of different feature-extraction methods. Feature sets compared include those previously proposed in the literature, plus Zernike-moment based features. Comparisons are also made of using feature sets extracted from the firing-pin impression, from the breech-face region, and from the whole region of interest (firing-pin impression + breech-face region + flowback if present). Likelihood ratios are calculated using a statistical modelling pipeline that is standard in forensic voice comparison. Validation is conducted and results are assessed using validation procedures and validation metrics and graphics that are standard in forensic voice comparison.
Keywords: Firearm, Cartridge case, Likelihood ratio, Feature, Calibration, Validation
1. Introduction
1.1. Outline
When firearms are fired at a crime scene and cartridge cases are ejected, these fired cartridge cases may later be recovered. Forensic practitioners may then compare two fired cartridge cases recovered from the crime scene with each other – a comparison of a fired cartridge case which bears markings of questioned source with another fired cartridge case which bears markings of questioned source (hereinafter we refer to this as “Scenario 1”). Forensic practitioners may also compare a fired cartridge case recovered from the crime scene with cartridge cases that they fire from a firearm seized from a suspect – a comparison of a fired cartridge case which bears markings of questioned source with fired cartridge cases which bear markings of known source (hereinafter we refer to this as “Scenario 2”).
The evaluation in Scenario 1 could be conducted for investigative purposes, but could also be used for evidential purposes if no relevant firearms are available for comparison but the question of how many firearms were fired during the commission of a crime is relevant for legal decision making.
For simplicity, in the present paper we assume exactly two recovered cartridge cases in Scenario 1 and exactly one recovered cartridge case in Scenario 2. Real casework may involve larger numbers of recovered cartridge cases, but these can be dealt with via expansion or repetition of the methods described in the present paper.
For brevity, we will use the terms “questioned-source cartridge case” and “known-source cartridge case” as abbreviations for “cartridge case bearing marks of questioned source” and “cartridge case bearing marks of known source” respectively.
In the remainder of the introduction:
• We describe the anatomy of a fired cartridge case and the processes by which firearms leave marks on cartridge cases (§1.2).
• We describe current casework practice for comparison of fired cartridge cases (§1.3).
• We provide a summary of published research on feature-extraction methods and statistical-modelling methods that have previously been applied to forensic comparison of fired cartridge cases (§1.4).
In the remainder of the paper:
• We describe the hypotheses, including the specification of the relevant population, that we have adopted for calculating likelihood ratios in the context of the present research (§2).
• We describe a feature-based system that we have developed for calculation of likelihood ratios from images of fired cartridge cases (§3). The system includes:
  o a database of 3D digital images of the bases of fired cartridge cases (§3.2)
  o preprocessing of images (§3.3)
  o feature-extraction methods (§3.4)
  o a statistical modelling pipeline that calculates likelihood ratios (§3.5)
• We describe validation procedures (§4), and present and discuss the validation results (§5 and §6).
The focus of the present paper is on comparing the performance of different feature-extraction methods. The best-performing feature-extraction method will be used in planned future research using a larger database and Deep Neural Network (DNN) embeddings.
The research reported in the present paper is part of a wider programme of research which is outlined in Morrison [1].
1.2. Anatomy of a fired cartridge case
Fig. 1 shows an example of an image of the base of a fired cartridge case. The head-stamp region includes text indicating the manufacturer and calibre of the cartridge case. We assume that this information is factual, and that it narrows the “class” of the cartridge case without the need for interpretation. The other regions together constitute the region of interest and each of these regions is italicized on its first mention in the paragraph below.
Fig. 1.
Graphical representations of an example of the base of a fired cartridge case (9 mm diameter Luger-type ammunition). (a) Perpendicular view. (b) Oblique view with z scale exaggerated by a factor of 5.
An unfired cartridge of ammunition consists of a cartridge case, a bullet, and explosives. The cartridge case is a metal tube that is sealed at its base and is plugged at the other end (its mouth) by the bullet. Between the base of the cartridge case and the bullet are explosives. A cartridge is loaded into the chamber of a firearm. When the firearm is fired, the firing pin strikes the primer cup on the base of the cartridge case. This deforms the primer cup creating a concave firing-pin impression.1 This kinetic action initiates an explosion within the cartridge case which forces the bullet forward out of the mouth of the cartridge case and along the barrel of the firearm. The explosion also forces the cartridge case backward until its base impacts the breech of the firearm.2 This creates an impression of the breech face on the base of the cartridge case. The region of the base of the cartridge case where this impression is made is called the breech-face region. The explosion can also push outward the area around the firing-pin impression leading to convex deformation known as flowback. After the firearm has been fired, the cartridge case is (manually or automatically) ejected so that a new unfired cartridge can be loaded into the chamber. Typically, ejected cartridge cases fall to the ground, and they can potentially be recovered at a later time.
Breech faces are not perfectly smooth. They have irregularities due to the manufacturing process and potentially due to later wear or damage. These irregularities vary across firearms. A breech-face impression on the base of a fired cartridge case will reflect the irregularities of a breech face. The transferred patterns of irregularities can include, but are not limited to, parallel series of peaks and troughs. Differences in the irregularities of the breech faces of different firearms will cause variability in the breech-face impressions on cartridge cases fired from different firearms. Differences in the transfer of the irregularities of a breech face to cartridge cases will cause variability in the breech-face impressions on cartridge cases fired from the same firearm. Similarly, the location, shape, and surface details of firing-pin impressions can vary both across fires from the same firearm and across fires from different firearms.
Considering a firearm as the source of breech-face and firing-pin impressions, inferences with respect to which firearm fired a cartridge case can be drawn if the between-source variability in breech-face and firing-pin impressions is greater than their within-source variability.3
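The condition stated above can be illustrated with a toy calculation. The following is a minimal sketch, assuming a single scalar feature and made-up values; the function name and the data are hypothetical and are not part of the system described in this paper.

```python
from statistics import mean, pvariance

def within_between_variance(features_by_firearm):
    """Return (within-source, between-source) variance of a scalar feature.

    features_by_firearm maps a firearm label to the feature values measured
    on cartridge cases fired from that firearm (hypothetical toy data).
    """
    # Within-source: average of the per-firearm variances.
    within = mean(pvariance(v) for v in features_by_firearm.values())
    # Between-source: variance of the per-firearm means.
    between = pvariance([mean(v) for v in features_by_firearm.values()])
    return within, between

# Made-up feature values for three firearms, three fires each.
toy = {
    "firearm_A": [1.0, 1.2, 0.8],
    "firearm_B": [3.0, 3.1, 2.9],
    "firearm_C": [5.0, 4.8, 5.2],
}
within, between = within_between_variance(toy)
print("within:", within, "between:", between)
```

In this toy example the between-source variance is much larger than the within-source variance, which is the situation in which the feature would be informative about the source.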
1.3. Current casework practice
For reviews of current casework practice in firearm examination see Bolton-King [2] and Nichols [3]. In current widespread practice, the analysis is a human-perception process and the interpretation of the extracted information is a subjective-judgement process. The forensic practitioner visually compares fired cartridge cases, viewing them side by side through a comparison microscope.4 Properties that practitioners report taking into consideration include the position and shape of the firing-pin impression, and the heights, widths, and distances between parallel peaks and troughs on the firing-pin impression and on the breech-face region (Tobin & Blau [4]; Tai & Eddy [5]). Current widespread practice is to report the conclusion as a categorical decision, i.e., as “identification”, “inconclusive”, or “elimination” (or as “unsuitable for analysis”). Existing validation studies of practitioner performance have tended to use a small number of test trials, and have seldom reflected real casework conditions (Smith et al. [6]; Mattijssen et al. [7], [8]; Scurich et al. [9]).
For Scenario 2, practitioners typically fire 3 cartridges from the firearm of interest and compare the fired cartridge cases with the cartridge case recovered from the crime scene.
Images of fired cartridge cases for comparison with an image of a questioned-source cartridge case recovered from a crime scene may be selected via an automated database search. The automated search returns a set of candidates for comparison with the questioned-source cartridge case, i.e., cartridge-case images in the database that the automated system determines to be the most similar to the questioned-source cartridge-case image. Thereafter, the comparison between the questioned-source image and the known-source images in the candidate set becomes a variant of Scenario 1 or Scenario 2: the evaluation is conducted using the human-perception and human-judgement processes described above.
In a survey of practitioners presented in Scurich et al. [9], ∼7% of respondents reported that a typical fired-cartridge-case comparison took less than 30 min, ∼24% that it took 30–60 min, ∼28% that it took 1–2 h, ∼26% that it took 2–4 h, and ∼15% that it took more than 4 h.
1.4. Previous work using data, quantitative measurements, and statistical models
1.4.1. Introduction
In this subsection, we summarize published research on quantitative-measurement and statistical-modelling methods that have been applied in forensic comparison of fired cartridge cases. We first describe existing databases of images of the bases of fired cartridge cases (§1.4.2). We then summarize feature-extraction methods (§1.4.3), and statistical models that have been applied to features and to similarity scores (§1.4.4). Finally, we summarize the results of research on practitioners’ attitudes toward the use of statistical models (§1.4.5).
1.4.2. Databases
Data of interest consist of 2D or 3D digital images of cartridge-case bases. 2D photographic images capture reflected light. 3D images capture surface topography, including depth information. In the present paper we focus on 3D images. There are several commercially marketed 3D imaging systems. The two most commonly used in operational forensic laboratories are Evofinder® and IBIS®. Research using such systems has the advantage of potentially being more quickly applicable to casework.
Published research on statistical models for comparison of fired cartridge cases has made use of training and validation datasets that are relatively small. Some existing datasets consist of a large number of fires from a small number of firearms, e.g., 10–60 test-fires from each of 1–5 firearms (Thumwarin et al. [10]; Liong et al. [11]; Ott et al. [12]; Addinall et al. [13]), and others consist of a small number of fires from a somewhat larger number of firearms, e.g., 1–4 test-fires from each of 10–90 firearms (Xin et al. [14]; Legrá et al. [15]; Fadul et al. [16]).5 In addition, only a subset of the datasets used in published research have themselves been published and made available to other researchers and practitioners. Published datasets include those in the NIST Ballistics Toolmark Research Database (NBTRD).6 Some of the more commonly used datasets are described in: Lightstone [19]; LaPorte [20]; Fadul et al. [16].
In order to train a forensic-evaluation system that outputs likelihood ratios, one has to model both within-source and between-source variability. In order to do this, a dataset would be needed that includes a relatively large number of fires from each of a relatively large number of firearms of the same class. Datasets with a large number of firearms consisting of a small number from each of multiple classes would not be suitable for addressing “individualization”, as opposed to “class”, questions. To our knowledge, there are no existing datasets accessible for research purposes that contain images of a sufficient number of cartridge cases fired from each of a sufficient number of firearms of the same class to satisfy our requirements for training and validating a likelihood-ratio system.
1.4.3. Feature extraction
In published research, features have typically been extracted from the firing-pin impression and from the breech-face region. Flowback has usually been excluded from analysis (Ott et al. [12]; Song et al. [21]). Many features have been based on quantifications of what forensic practitioners report they pay attention to (see §1.3), but others have been based on functions fitted to image data without regard for interpretability of those features by humans. We will refer to the former as “human-inspired features” and the latter as “functional features”.
Human-inspired features that have been extracted from firing-pin impressions include those based on the impression's location (Legrá et al. [15]), overall shape (Zhou et al. [22]; Li [23]; Thumwarin et al. [10]), and surface texture (Legrá et al. [15]). Human-inspired features that have been extracted from the breech-face region include those based on low-frequency undulations of parallel peaks and troughs, termed “waviness” in the literature (Gambino et al. [23]; Petraco et al. [24]), and those based on higher-frequency irregularities/residuals in those undulations, termed “roughness” in the literature (Petraco et al. [25]; Pan et al. [26]). Most of these features have been extracted from manually-selected parts of the firing-pin impression or of the breech-face region.
Functional features that have been extracted from firing-pin impressions include values of central geometric moments (Ghani et al. [27]) and of Legendre moments (Chuan et al. [28]). From the whole cartridge-case base (including the headstamp region), Leng & Huang [29] extracted as features the values of circle-moment invariants (a modified version of central moments). From the whole region of interest (firing-pin impression + any flowback + breech-face region), Thumwarin et al. [10] extracted as features the magnitude-coefficient values from Fourier series fitted independently to each member of a set of concentric circles.
1.4.4. Statistical models
Statistical models applied in the published research have primarily been classification models rather than likelihood-ratio models. These classification models have included k nearest neighbors (Fischer & Vielhauer [30], [31]; Morris et al. [32]), linear discriminant analysis (Thumwarin et al. [10]; Ghani et al. [27]; Chuan et al. [28]), support vector machines (Zhou et al. [22]), bagged decision trees (Morris et al. [32]), and neural networks (Li [33]; Leng & Huang [29]; Morris et al. [32]; Ghani et al. [34]; Giudice et al. [35]; Razak et al. [36]).
Other statistical models used for classification or for database search have skipped extraction of features and have been based on similarity scores calculated as the correlation between pairs of digital images, i.e., the correlation between the z values (the intensities for 2D images, or the heights for 3D images) at the corresponding x and y points of the two images. Similarity scores are calculated for pairs of cartridge cases known to come from the same source and for pairs of cartridge cases known to come from different sources, and statistical models are fitted to these two sets of scores (Roth et al. [37]; Song [38]; Ott et al. [12]; Tai & Eddy [5], [39]; Zhang [40]). This approach has been applied to the whole of the firing-pin impression or the whole of the breech-face region (Song et al. [41]; Roth et al. [37]). Prior to calculating the correlation coefficient, the firing-pin impressions or breech-face regions from the two cartridge cases must be registered (rotated and aligned) relative to each other or to a common target. Rather than calculating the correlation over the whole of the firing-pin impression or the whole of the breech-face region, a commonly used approach is “congruent matching cells” (CMC), which calculates correlations over smaller areas called “cells”. The cells are, for example, squares of predetermined size defined by a grid superimposed on the image. Each cell from the questioned-source image is independently rotated and aligned relative to the known-source image in order to find the cell on the latter that is maximally correlated with the former.7 If the maximum correlation coefficient achieved exceeds a predefined threshold, the pair of cells is designated a CMC. The number of CMCs between a pair of fired cartridge cases can be used as a similarity score. Variants of the CMC approach are described in: Zhang et al. [42], [43]; Chen et al. [44]; Tong et al. [45], [46].
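The cell-counting idea can be sketched in a few lines. The following is a deliberately simplified illustration, assuming two already-registered height matrices of equal size; it correlates each grid cell only with the cell at the same position and omits the per-cell rotation and alignment search that the published CMC methods perform. The function names, toy data, cell size, and threshold are all hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length sequences of heights."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat cell carries no pattern to correlate
    return cov / sqrt(var_a * var_b)

def count_matching_cells(img_q, img_k, cell, threshold):
    """Count same-position grid cells whose correlation exceeds the threshold."""
    rows, cols = len(img_q), len(img_q[0])
    count = 0
    for r0 in range(0, rows - cell + 1, cell):
        for c0 in range(0, cols - cell + 1, cell):
            idx = [(r, c) for r in range(r0, r0 + cell)
                   for c in range(c0, c0 + cell)]
            q = [img_q[r][c] for r, c in idx]
            k = [img_k[r][c] for r, c in idx]
            if pearson(q, k) > threshold:
                count += 1
    return count

# Toy 4 x 4 "images" split into four 2 x 2 cells (made-up heights).
img_q = [[0.0, 1.0, 4.0, 2.0],
         [2.0, 3.0, 1.0, 3.0],
         [1.0, 2.0, 5.0, 5.0],
         [3.0, 4.0, 5.0, 5.0]]
img_k = [[0.0, 1.0, 9.0, 5.0],
         [2.0, 3.0, 3.0, 7.0],
         [-1.0, -2.0, 0.0, 1.0],
         [-3.0, -4.0, 2.0, 3.0]]
n_cmc = count_matching_cells(img_q, img_k, cell=2, threshold=0.9)
print("matching cells:", n_cmc)
```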
To our knowledge, there is no published research describing calculation of likelihood ratios using statistical models applied to features separately extracted from each cartridge case, but there are a number of papers that describe calculation of likelihood ratios based on similarity scores. The most commonly used similarity score has been a correlation coefficient between pairs of digital images, calculated over the whole of or selected portions of the firing-pin impression or of the breech-face region (Riva & Champod [47]; Dong et al. [48]; Mattijssen et al. [7]; Riva et al. [49]). Other similarity scores used have been based on Euclidean distance between pairs of digital images, and on instantaneous angles on the surfaces of pairs of 3D images (Riva & Champod [47]; Riva et al. [49]). The most commonly used models have fitted kernel density distributions (Riva & Champod [47]; Dong et al. [48]; Mattijssen et al. [7]; Riva et al. [49]). Song et al. [50] used counts of the number of CMCs as similarity scores and fitted beta-binomial models to the count data. Similarity scores, however, do not take account of typicality with respect to the relevant population, and are therefore not an appropriate basis for calculating meaningful likelihood ratios in a forensic context (Morrison & Enzinger [51]; Neumann & Ausdemore [52]; Neumann et al. [53]). In the present paper we will therefore describe and validate a feature-based system for calculation of likelihood ratios.
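For contrast with the feature-based approach, the score-based procedure summarized above can be sketched as follows. This is a minimal illustration, assuming Gaussian kernels with a fixed bandwidth and made-up similarity scores; the function names, bandwidth, and data are hypothetical and do not reproduce any of the cited implementations.

```python
from math import exp, pi, sqrt

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density

def score_based_lr(score, same_source_scores, diff_source_scores,
                   bandwidth=0.1):
    """Ratio of the two kernel density estimates evaluated at the score."""
    p_same = gaussian_kde(same_source_scores, bandwidth)(score)
    p_diff = gaussian_kde(diff_source_scores, bandwidth)(score)
    return p_same / p_diff

# Made-up correlation scores for same-source and different-source pairs.
same_scores = [0.7, 0.8, 0.9]
diff_scores = [0.1, 0.2, 0.3]
lr_high = score_based_lr(0.8, same_scores, diff_scores)
lr_low = score_based_lr(0.2, same_scores, diff_scores)
print(lr_high > 1.0, lr_low < 1.0)
```

Note that the ratio depends only on the similarity score, not on where the two items lie relative to the relevant population, which is the limitation discussed above.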
1.4.5. Practitioners’ attitudes toward the use of statistical models
In a survey of practitioners presented in Scurich et al. [9], some respondents had skeptical (or even hostile) attitudes toward the use of statistical models for comparison of bullets and comparison of fired cartridge cases, but others had more positive attitudes. One of the respondents with a more positive attitude emphasized the need for developers of statistical models to have a thorough understanding of firearms examination, and another emphasized the need for improved performance and for larger databases.
2. Hypotheses and relevant population
2.1. Introduction
In this section, we restate the two casework scenarios of interest, and state the hypotheses that we have adopted with respect to each of these scenarios, including specifying the relevant population. For both scenarios, the hypotheses define a common-source question.8
2.2. Scenario 1
One or more firearms are fired at a crime scene and the cartridge cases are ejected. Crime-scene investigators later recover two fired cartridge cases. A forensic practitioner compares the two questioned-source cartridge cases with one another and draws an inference with respect to whether they were fired by the same firearm or not.
Same-source hypothesis: The two cartridge cases were fired by the same firearm.
Different-source hypothesis: The two cartridge cases were fired by different firearms from the same population.
2.3. Scenario 2
A firearm is fired at a crime scene and the cartridge case is ejected. Crime-scene investigators later recover the fired cartridge case. Police investigators seize a firearm from a suspect. A forensic practitioner fires multiple cartridges from the seized firearm and collects the ejected cartridge cases. The forensic practitioner then compares the fired cartridge case recovered from the crime scene (the questioned-source cartridge case) with the cartridge cases fired from the suspect's firearm (the known-source cartridge cases) and draws an inference with respect to whether the questioned-source and known-source cartridge cases were fired by the same firearm or not.
Same-source hypothesis: The cartridge case bearing marks of questioned source and the multiple cartridge cases bearing marks of a single known source were fired by the same firearm.
Different-source hypothesis: The cartridge case bearing marks of questioned source and the multiple cartridge cases bearing marks of a single known source were fired by different firearms from the same population.
We will test two versions of Scenario 2, one in which the practitioner fires 3 cartridges from the seized firearm, and one in which they fire 9 cartridges.
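Under either scenario, the quantity to be calculated is a likelihood ratio that compares the probability of the evidence under the two competing hypotheses. As a reminder of the standard definition (the symbols E, H_s, and H_d are notation assumed here for illustration, standing for the measured properties of the cartridge cases, the same-firearm hypothesis, and the different-firearm hypothesis respectively):

```latex
\Lambda = \frac{p(E \mid H_s)}{p(E \mid H_d)}
```

Values greater than 1 indicate that the evidence is more probable if the same-firearm hypothesis is true than if the different-firearm hypothesis is true, and values less than 1 indicate the converse.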
2.4. Relevant population
In casework, the practitioner would first examine the questioned-source cartridge case in order to assess the class of firearms from which the cartridge case may have been fired. For the purposes of the research reported in the present paper, the relevant population of firearms that we have adopted is semi-automatic pistols that fire 9 mm diameter centre-fire Luger-type ammunition, and that have hemispherical firing pins and parallel breech-face marks. Examples of firearms in this class are Browning Hi-Power, CZ 75, Beretta 92FS, and Ruger P85. This particular class was chosen as the relevant population for the present research because it is commonly encountered in casework [55].9
The evaluation of the class of the firearm is generally considered to be the easiest step in the forensic comparison of fired cartridge cases due to gross differences in geometric form between classes (Bolton-King [2]; Nichols [3]). The present paper is not concerned with evaluation of class-level hypotheses.
3. Fired-cartridge-case-comparison system
3.1. Introduction
In this section, we describe the system we have developed for feature-based calculation of likelihood ratios from images of fired cartridge cases. First, we describe the construction of a database of 3D images of fired cartridge cases that was used for training and validating the algorithmic stages of the system (§3.2), then we describe the algorithmic stages of the system (§3.3–§3.5).
The image-preprocessing, feature-extraction, and statistical-modelling stages of the system are outlined in Fig. 2. In the initial stages, information from a known-source cartridge case and information from a questioned-source cartridge case are processed in parallel. In the final stages, the known-source and questioned-source information are combined.
Fig. 2.
Schematic of the feature-extraction and statistical-modelling stages of the system. Abbreviations: k = known source; q = questioned source; LR = likelihood ratio.
In the first stage, images are preprocessed prior to feature extraction (§3.3). In the next stage, feature vectors are extracted from the images. In §3.4, we provide details of the multiple feature-extraction methods that we have tested. These methods include those that have previously been proposed and applied in the research literature (see §1.4.3), plus an additional method (Zernike moments) that a priori we expected to be effective.
The last three stages in the system are: dimension reduction, calculation of uncalibrated likelihood ratios, and calibration. In §3.5, we describe the statistical models used in each of these stages. The use of this statistical modelling pipeline is standard for backend modelling in state-of-the-art forensic-voice-comparison systems (Morrison et al. [56], [57]; Weber et al. [58]).
Matlab® code implementing the algorithms described in §3.3–§3.5 is available from: https://forensic-data-science.net/firearms/.
3.2. Database
The data for the present research were taken from the E3 Database of Fired Cartridge Cases (release 1), that we built as part of the present research. This database is available from NBTRD.10 A link to the database is also provided at: https://forensic-data-science.net/firearms/.
This database consists of 3D images of the bases of cartridge cases fired from firearms that were in the possession of a number of operational forensic laboratories, law-enforcement agencies, military units, and private individuals in Barbados, Canada, France, Germany, the UK, and the USA. The cartridges used were taken from whatever each provider had available, with the condition that they have brass primer cups. Ten cartridges were fired from each firearm (on occasion, one or more of the resulting cartridge cases were missing). The original aim was to collect 3D images of the bases of cartridge cases fired from 1000 firearms, but progress toward this target was slowed by the COVID-19 pandemic. We plan to continue building the database and, in the future, release additional data. The research reported in the present paper makes use of data from cartridge cases fired from 297 firearms. This was the number available after excluding any firearms for which we received fewer than 8 fired cartridge cases.
The bases of the fired cartridge cases were digitally imaged using Evofinder® (software version 6.6.1.17), which uses a mixture of photometric stereo imaging and focus variation to capture 3D surface topography. The base of each cartridge case was digitally imaged, and the resulting data were exported as a matrix of values in x3p format11 with a resolution of 280 samples per mm in each of the x direction and the y direction (3.6 μm between samples). The resolution in the z direction was able to capture differences in height of less than 1 μm.
For the present research, the dataset was divided into two parts using a ⅔ versus ⅓ split: Data from 198 firearms (hereinafter the “training set”) were used to train all the models up to and including calculation of uncalibrated likelihood ratios, and data from the remaining 99 firearms (hereinafter the “calibration/validation set”) were used for cross-validated training of the calibration model and for validation.
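The important property of this split is that it is made at the level of the firearm, so that all cartridge cases fired from a given firearm fall on the same side. The following is a minimal sketch of such a split; the random assignment, the seed, and the firearm labels are assumptions made for illustration, since the text does not state how firearms were allocated to the two sets.

```python
import random

def split_by_firearm(firearm_ids, n_train, seed=0):
    """Split firearm labels into two disjoint sets.

    Splitting on firearm labels (rather than on individual cartridge cases)
    keeps all fires from one firearm together. The random shuffle and the
    seed are illustrative assumptions.
    """
    ids = sorted(firearm_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]

# Hypothetical labels for the 297 firearms used in the present research.
firearms = [f"firearm_{i:03d}" for i in range(297)]
train_ids, calval_ids = split_by_firearm(firearms, n_train=198)
print(len(train_ids), len(calval_ids))
```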
3.3. Preprocessing
Prior to feature extraction, we applied the following commonly-used preprocessing steps:
1. Segmentation: Separation of the firing-pin impression and the breech-face region from the remainder of the image and from each other.
2. Illumination correction: Correction for non-uniformities in illumination, including planar-bias correction.
3. Noise removal: Removal of imaging artifacts.
4. Registration: Rotation and alignment.
Details of commonly-used preprocessing procedures are provided in Tai & Eddy [5]. Preprocessing is not a focus of the present paper, so we do not provide details here. For segmentation, whereas Tai & Eddy [5] use thresholds based on individual pixel values with predetermined threshold values, we used adaptive thresholds based on smoothed contours.12
The following regions were segmented:
(a) the whole of the region of interest including flowback if present
(b) the whole of the region of interest excluding flowback if present
(c) the firing-pin impression alone
(d) the breech-face region alone
Fig. 3 shows examples of each of these segmented regions.
Fig. 3.
Examples of segmented regions of a cartridge case (using the same example image as in Fig. 1): (a) whole region of interest including flowback; (b) whole region of interest excluding flowback; (c) firing-pin impression alone; (d) breech-face region alone. For the oblique views, the z scale is exaggerated by a factor of 5.
Although flowback has usually been excluded from analysis (Ott et al. [12]; Song et al. [21]), we hypothesized that the flowback region would contain useful information related to the firearm that fired the cartridge.
The outputs of preprocessing were matrices of values with a resolution of 56 samples per mm in each of the x direction and the y direction (the downsampling procedure included anti-aliasing low-pass filtering). Within each matrix, the x and y values were centred by subtracting their means (calculated over the whole of the segmented region), and were scaled such that the entire segmented region fell within a unit circle: $x^2 + y^2 \le 1$. This resulted in x and y values in the range [−1, +1] with 0 in the centre. z values that corresponded to x and y combinations that fell outside the segmented region did not contribute to the calculation of the feature values (these values were coded in Matlab as “not a number, NaN”). z values were scaled in millimetres, and were shifted so that the origin (zero value) was set to the plane fitted to the breech-face region during planar-bias correction (planar-bias correction was derived from the breech-face region only and applied to each of the segmented regions).
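The centring and scaling described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code referred to in §3.1; the function name, the use of a Boolean mask to mark the segmented region, and scaling by the largest in-region distance from the centre are assumptions made for the example.

```python
from math import hypot, isnan, nan

def normalise_coordinates(z, mask):
    """Centre and scale grid coordinates so the masked region fits the unit circle.

    z    : list of rows of heights (mm)
    mask : list of rows of booleans, True inside the segmented region
    Returns (x, y, z_out) as lists of rows; z_out is NaN outside the region.
    """
    inside = [(r, c) for r, row in enumerate(mask)
              for c, m in enumerate(row) if m]
    # Means are calculated over the segmented region only.
    mean_r = sum(r for r, _ in inside) / len(inside)
    mean_c = sum(c for _, c in inside) / len(inside)
    # Largest distance from the centre to any in-region sample
    # (assumes the region contains more than one sample).
    radius = max(hypot(r - mean_r, c - mean_c) for r, c in inside)
    n_rows, n_cols = len(z), len(z[0])
    x = [[(c - mean_c) / radius for c in range(n_cols)] for _ in range(n_rows)]
    y = [[(r - mean_r) / radius for _ in range(n_cols)] for r in range(n_rows)]
    z_out = [[z[r][c] if mask[r][c] else nan for c in range(n_cols)]
             for r in range(n_rows)]
    return x, y, z_out

# Toy 3 x 3 height matrix with a plus-shaped segmented region.
z = [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6],
     [0.7, 0.8, 0.9]]
mask = [[False, True, False],
        [True, True, True],
        [False, True, False]]
x, y, z_out = normalise_coordinates(z, mask)
print(isnan(z_out[0][0]), x[1][0], x[1][2])
```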
Because of the preprocessing, all data matrices had the same scale and the same location. Some of the features extracted for the present research are rotation invariant, but others are not. As part of preprocessing, we therefore rotated the data matrices. An ideal rotation procedure would use the questioned-source cartridge case in Scenario 2/one of the questioned-source cartridge cases in Scenario 1 as the target and rotate all other data matrices used for training and testing to that target. This, however, would require each data matrix in the entire dataset to be independently rotated to each questioned-source cartridge case/to each cartridge case used in validation as if it were a questioned-source cartridge case. This would be prohibitive in terms of processing time. We therefore arbitrarily selected one cartridge case from our dataset (the one shown in Fig. 1 and Fig. 3), rotated data from all other cartridge cases to this arbitrary target, then used this single rotated dataset for training and validation. The cost of rotation, especially ideal rotation, is a reason to prefer rotation-invariant features, but if rotation leads to substantial improvement in performance that cost may be justified.
3.4. Feature extraction
3.4.1. Introduction
We extracted and tested the same sets of functional features that have previously been proposed and applied in the published literature on forensic comparison of fired cartridge cases (see §1.4.3). We also extracted and tested Zernike moments (Zernike [59]; Teague [60]; Khotanzad & Hong [61]). Zernike moments have been widely used in many fields, including optometry, photonics, astronomy, and facial-expression analysis (e.g., Iskander et al. [62]; Sun et al. [63]; Pinhasi et al. [64]; Vretos et al. [65]). They are orthogonal and rotation invariant and have been found to outperform other moment-based approaches in terms of noise resilience, information redundancy, reconstruction capability, and classification accuracy (e.g., Teh & Chin [66]; Khotanzad & Hong [61]; Belkasim et al. [67]). We hypothesized that using Zernike moments as part of a fired-cartridge-case comparison system would result in better performance than using any of the previously proposed functional features.13
Below, we provide details of the extraction of:
• central moments (§3.4.2)
• circle-moment invariants (§3.4.3)
• Legendre moments (§3.4.4)
• coefficients of Fourier series fitted to concentric circles (§3.4.5)
• Zernike moments (§3.4.6)
§3.4.7 provides, for each feature-extraction method, the number of features that we extracted.
3.4.2. Central moments
Central moments were previously applied to forensic comparison of fired cartridge cases in Ghani et al. [27].
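Raw geometric moments have the general form given in Equation (1), in which $p$ and $q$ are non-negative integers that specify the orders of the moment, and $f(x, y)$ is an arbitrary function of $x$ and $y$. Equation (2) provides the form applicable for digital data. $X$ and $Y$ are the number of discrete $x$ and discrete $y$ values respectively. Equation (3) provides the formula for calculating central moments. Since we centred our data in $x$ and $y$ during preprocessing, $\bar{x} = 0$ and $\bar{y} = 0$, and there will be no difference between our $m_{pq}$ and $\mu_{pq}$ values.

$$m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^{p} y^{q} f(x, y) \, dx \, dy \tag{1}$$

$$m_{pq} = \sum_{i=1}^{X} \sum_{j=1}^{Y} x_i^{p} \, y_j^{q} \, f(x_i, y_j) \tag{2}$$

$$\mu_{pq} = \sum_{i=1}^{X} \sum_{j=1}^{Y} (x_i - \bar{x})^{p} \, (y_j - \bar{y})^{q} \, f(x_i, y_j) \tag{3}$$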
Fig. 4 shows the products of the power functions on $x$ and on $y$ up to order 4 plotted over a disc of unit radius. All panels in Fig. 4 through Fig. 8 are plotted with $[-1, 1]$ as the range for each of the $x$, $y$, and $z$ axes. Because the $x$ and $y$ values are in the range $[-1, 1]$, as $p$ and $q$ increase, the magnitudes of the outputs of the power functions decrease toward zero. For visualization purposes, in each panel of Fig. 4 we have scaled the product of the power functions so that the maximum magnitude on the $z$ axis is 1. The plot in each panel represents a function that, if sampled at the same $x$ and $y$ values as the matrix of data values $f(x, y)$, produces a matrix of values that can be pointwise multiplied with the matrix of data values and the products summed to extract a scaled central moment that can be used as a feature value.
Fig. 4.
Plots of scaled products of power functions used in the calculation of central moments up to order 4.
Fig. 8.
Plots of Zernike polynomials used in the calculation of Zernike moments up to order 4.
Central moments are not orthogonal and are not rotation invariant.
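The pointwise-multiply-and-sum operation described above can be illustrated with a minimal sketch. The grid size, the helper names `unit_disc_grid` and `central_moment`, and the tilted-plane test surface are hypothetical choices made for illustration only; they are not taken from the system described in this paper, and the sketch assumes the data have already been centred and scaled to a unit disc.

```python
import numpy as np

def unit_disc_grid(half=50):
    """Return x and y coordinate matrices on [-1, 1] and a boolean unit-disc mask."""
    coords = np.arange(-half, half + 1) / half   # exactly symmetric about zero
    x, y = np.meshgrid(coords, coords)
    mask = x**2 + y**2 <= 1.0
    return x, y, mask

def central_moment(f, x, y, mask, p, q):
    """Sum of x^p * y^q * f(x, y) over the disc (data assumed already centred)."""
    kernel = (x**p) * (y**q)
    return float(np.sum(kernel[mask] * f[mask]))

x, y, mask = unit_disc_grid()
f = 1.0 + 0.5 * x                  # hypothetical surface: a plane tilted along x
mu_00 = central_moment(f, x, y, mask, 0, 0)
mu_10 = central_moment(f, x, y, mask, 1, 0)
mu_01 = central_moment(f, x, y, mask, 0, 1)
```

For this test surface, the moment of order (1, 0) picks up the tilt along $x$, whereas the moment of order (0, 1) is zero apart from floating-point error, because the surface has no tilt along $y$.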
3.4.3. Circle-moment invariants
Circle-moment invariants were previously applied to forensic comparison of fired cartridge cases in Leng & Huang [29].
Circle-moment invariants are a modified version of central moments. They have the form given in Equation (4). Whereas central moments use the signed values of the power functions on $x$ and on $y$, circle-moment invariants use the absolute values. Fig. 5 shows the products of the absolute power functions on $x$ and on $y$ up to order 4 plotted over a disc of unit radius. For visualization purposes, we have scaled the product of the absolute power functions so that the maximum magnitude in each panel is 1.
Fig. 5.
Plots of scaled products of absolute power functions used in the calculation of circle-moment invariants up to order 4.
Circle-moment invariants are rotation invariant, but not orthogonal.

$$c_{pq} = \sum_{i=1}^{X} \sum_{j=1}^{Y} \left| x_i \right|^{p} \left| y_j \right|^{q} f(x_i, y_j) \tag{4}$$
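The effect of replacing signed power functions with their absolute values can be sketched as follows. The sketch only demonstrates one consequence of discarding sign, namely that the value is unchanged when the surface is mirrored in $x$, whereas an odd-order signed moment changes sign. The function names and the test surface are hypothetical, and the sketch is not a reimplementation of the method in Leng & Huang [29].

```python
import numpy as np

half = 50
coords = np.arange(-half, half + 1) / half       # symmetric grid on [-1, 1]
x, y = np.meshgrid(coords, coords)
mask = x**2 + y**2 <= 1.0                        # unit disc

def signed_moment(f, p, q):
    """Moment using signed power functions of x and y."""
    return float(np.sum((x**p * y**q * f)[mask]))

def circle_moment(f, p, q):
    """Moment using absolute values of the power functions of x and y."""
    return float(np.sum((np.abs(x)**p * np.abs(y)**q * f)[mask]))

f = 1.0 + 0.5 * x + 0.25 * y**2                  # hypothetical surface
f_mirrored = f[:, ::-1]                          # same surface with x replaced by -x

signed_pair = (signed_moment(f, 1, 0), signed_moment(f_mirrored, 1, 0))
circle_pair = (circle_moment(f, 1, 0), circle_moment(f_mirrored, 1, 0))
```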
3.4.4. Legendre moments
Legendre moments were previously applied to forensic comparison of fired cartridge cases in Chuan et al. [28].
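The previously considered moments have used a power function of $x$ and a power function of $y$, but moments can be generalized to use other functions. Legendre moments have the form given in Equation (5), in which $P_p(x)$ is a Legendre polynomial of order $p$. Legendre polynomials up to order 4 are given in Equation (6). After the specification of the zeroth and first Legendre polynomials, higher orders in the series can be generated using Equation (7). Fig. 6 shows the scaled products of Legendre polynomials on $x$ and on $y$ up to order 4 plotted over a disc of unit radius. Legendre moments are orthogonal, but not rotation invariant.

$$\lambda_{pq} = \frac{(2p+1)(2q+1)}{4} \int_{-1}^{1} \int_{-1}^{1} P_p(x) \, P_q(y) \, f(x, y) \, dx \, dy \tag{5}$$

$$\begin{aligned} P_0(x) &= 1 \\ P_1(x) &= x \\ P_2(x) &= \tfrac{1}{2}\left(3x^2 - 1\right) \\ P_3(x) &= \tfrac{1}{2}\left(5x^3 - 3x\right) \\ P_4(x) &= \tfrac{1}{8}\left(35x^4 - 30x^2 + 3\right) \end{aligned} \tag{6}$$

$$P_{n+1}(x) = \frac{(2n+1) \, x \, P_n(x) - n \, P_{n-1}(x)}{n+1} \tag{7}$$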
Fig. 6.
Plots of scaled products of Legendre polynomials used in the calculation of Legendre moments up to order 4.
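The generation of higher-order Legendre polynomials from the zeroth and first polynomials can be sketched using the standard three-term recurrence. The function name `legendre` is a hypothetical choice; the result is compared against the closed form for order 4 and against NumPy's `numpy.polynomial.legendre.legval`.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def legendre(order, t):
    """Evaluate the Legendre polynomial of the given order at t by recurrence."""
    t = np.asarray(t, dtype=float)
    p_prev, p_curr = np.ones_like(t), t.copy()   # P_0 and P_1
    if order == 0:
        return p_prev
    for n in range(1, order):
        # (n + 1) P_{n+1}(t) = (2n + 1) t P_n(t) - n P_{n-1}(t)
        p_prev, p_curr = p_curr, ((2 * n + 1) * t * p_curr - n * p_prev) / (n + 1)
    return p_curr

t = np.linspace(-1.0, 1.0, 11)
p4_recurrence = legendre(4, t)
p4_closed_form = (35 * t**4 - 30 * t**2 + 3) / 8
p3_numpy = legval(t, [0, 0, 0, 1])               # coefficient vector selects P_3
```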
3.4.5. Concentric-circle features
Fourier series fitted to concentric circles were previously applied to forensic comparison of fired cartridge cases in Thumwarin et al. [10]. The coefficient values from the Fourier series were used as features. For brevity, we refer to these features as “concentric-circle features”.
Imagine a circle of radius $r$ and a function $f(r, \theta)$ where $\theta$ specifies the angle in radians around the circumference of the circle. Fix $r$, and fit a Fourier series to the function with the first-order component being a cosine with a period of $2\pi$ radians. All non-zeroth components will be cosines whose periods are $2\pi/k$ radians where $k$ is a positive integer. Each component will therefore complete an integer number of periods as it travels around the circumference of the circle and will meet itself exactly in phase. The function can be reconstructed to order $K$ using a Fourier series as in Equation (8), in which $a_0$ is the mean value of $f(r, \theta)$, and $a_k$ is the magnitude coefficient and $\varphi_k$ the phase coefficient of component $k$ of the series. Cosine functions up to $K = 4$ with zero phase ($\varphi_k = 0$ for all $k$) fitted to a unit-radius circle ($r = 1$) are plotted in the top row of Fig. 7. Other rows of Fig. 7 show cosine functions up to successively lower $K$ values fitted to successively smaller circles. Across the rows of Fig. 7, the period ($2\pi r / K$) of the highest order cosine function is the same.

$$\hat{f}(r, \theta) = a_0 + \sum_{k=1}^{K} a_k \cos\left(k\theta - \varphi_k\right) \tag{8}$$

Fig. 7.
Cosine components of a Fourier series up to component 4 ($K$ = 4) fitted to a unit-radius circle ($r$ = 1), and cosine components up to successively lower $K$ fitted to successively smaller circles.
As in Thumwarin et al. [10], we only extracted concentric-circle features from the whole region of interest including flowback. In order to fit Fourier series covering the segmented region of interest, we specified the radius $r$ of each member of a series of concentric circles. $r$ values were selected such that circles fell entirely within the segmented region of interest. We transformed the Cartesian-coordinate data matrices, $f(x, y)$, to polar coordinates, $f(r, \theta)$. We then selected the data points that were closest to each concentric circle. Fourier series were fitted independently to each circle.
As the radii of the circles decrease, so do the lengths of their circumferences. As $r$ decreased, we decreased the order $K$ of the Fourier series so that, across all circles, when measured in millimetres, the period of the highest order component was the same. The reduction in $K$ with reduction in $r$ is illustrated in Fig. 7. Details of the values of $r$ and $K$ used for feature extraction in the present research are provided in §3.4.7 below.
The magnitude coefficients of a Fourier series can be used as rotation-invariant features. We henceforth refer to these features as “concentric-circle magnitude features”. We also extracted features that took account of both magnitude and phase. We henceforth refer to these features as “concentric-circle magnitude and phase features”. Phase per se is inconvenient as a feature because of the discontinuity of values where the phase wraps around. An alternative representation of a component of a Fourier series, given in Equation (9), makes use of weighted cosine and sine functions. We will use the weights $b_k$ and $c_k$ as paired features that together capture both magnitude and phase information.

$$a_k \cos\left(k\theta - \varphi_k\right) = b_k \cos(k\theta) + c_k \sin(k\theta) \tag{9}$$
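The relationship between the paired cosine and sine weights and the rotation-invariant magnitude can be sketched as a least-squares fit to samples around a single circle. The function name `fit_circle_fourier`, the sampling density, and the synthetic profile are hypothetical; the sketch assumes evenly spaced samples and does not reproduce the nearest-data-point selection described above.

```python
import numpy as np

def fit_circle_fourier(values, theta, order):
    """Least-squares fit of a mean plus cosine and sine weights up to `order`."""
    columns = [np.ones_like(theta)]
    for k in range(1, order + 1):
        columns += [np.cos(k * theta), np.sin(k * theta)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    # coef[0] is the mean; cosine and sine weights alternate thereafter
    return coef[0], coef[1::2], coef[2::2]

theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
profile = 0.3 + 2.0 * np.cos(3 * theta - 0.7)            # hypothetical profile
rotated = 0.3 + 2.0 * np.cos(3 * (theta - 0.4) - 0.7)    # same profile rotated by 0.4 rad

mean, b, c = fit_circle_fourier(profile, theta, order=4)
_, b_rot, c_rot = fit_circle_fourier(rotated, theta, order=4)

magnitude = np.hypot(b, c)             # unchanged by rotation
magnitude_rot = np.hypot(b_rot, c_rot)
```

In this sketch the individual weights change when the profile is rotated, but the magnitudes computed from each pair of weights do not, which is the sense in which magnitude-only features are rotation invariant.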
3.4.6. Zernike moments
Zernike polynomials were described in Zernike [59], and Teague [60] and Khotanzad & Hong [61] provide introductions to Zernike moments. To our knowledge, Zernike moments have not previously been applied to forensic comparison of fired cartridge cases.
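Zernike moments have the form given in Equation (10), in which $Z_{n,m}(\rho, \theta)$ is a Zernike polynomial parameterized in polar coordinates. Equation (11) provides the form for calculating Zernike moments from digital data. The constraint $x^2 + y^2 \le 1$ was already enforced by the preprocessing of our data.

$$A_{n,m} = \frac{n+1}{\pi} \iint_{x^2 + y^2 \le 1} f(x, y) \, Z_{n,m}(\rho, \theta) \, dx \, dy \tag{10}$$

$$A_{n,m} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y) \, Z_{n,m}(\rho, \theta), \quad x^2 + y^2 \le 1 \tag{11}$$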
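Zernike polynomials are calculated as in Equation (12), which consists of a function $R_{n,|m|}(\rho)$ dependent on distance $\rho$ from the centre of a disc of unit radius, and a function $\cos(m\theta)$ or $\sin(|m|\theta)$ dependent on angle $\theta$ around the disc. The angle function is also dependent on $m$, which can be specified as a positive or a negative integer, or as zero. For $m \ge 0$ the angle-dependent function is a cosine function, and the notation uses $m$ as a subscript. For $m < 0$ the angle-dependent function is a sine function, and the notation uses $-|m|$ as a subscript.

$$Z_{n,m}(\rho, \theta) = \begin{cases} R_{n,|m|}(\rho) \cos(m\theta) & m \ge 0 \\ R_{n,|m|}(\rho) \sin(|m|\theta) & m < 0 \end{cases} \tag{12}$$

The $R_{n,|m|}(\rho)$ are a series of orthogonal polynomial functions dependent on the values of $n$ and $|m|$, see Equation (13).

$$R_{n,|m|}(\rho) = \sum_{s=0}^{(n-|m|)/2} \frac{(-1)^{s} \, (n-s)!}{s! \left(\frac{n+|m|}{2} - s\right)! \left(\frac{n-|m|}{2} - s\right)!} \, \rho^{\,n-2s} \tag{13}$$

Zernike moments are defined for even values of $n - |m|$ with the constraint that $|m| \le n$. The $R_{n,|m|}(\rho)$ up to order 4 are given in Equation (14). Fig. 8 shows Zernike polynomials up to order 4 plotted over a disc of unit radius. The outputs of Zernike polynomials intrinsically fall in the range $[-1, 1]$.

$$\begin{aligned} R_{0,0}(\rho) &= 1 \\ R_{1,1}(\rho) &= \rho \\ R_{2,0}(\rho) &= 2\rho^2 - 1 \\ R_{2,2}(\rho) &= \rho^2 \\ R_{3,1}(\rho) &= 3\rho^3 - 2\rho \\ R_{3,3}(\rho) &= \rho^3 \\ R_{4,0}(\rho) &= 6\rho^4 - 6\rho^2 + 1 \\ R_{4,2}(\rho) &= 4\rho^4 - 3\rho^2 \\ R_{4,4}(\rho) &= \rho^4 \end{aligned} \tag{14}$$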
To calculate Zernike moments, we used the method described in Iskander et al. [62] and given in Equation (15), in which: $\mathbf{f}$ is a data matrix rearranged into a column vector; $\mathbf{Z}$ is a matrix in which each column is a matrix of Zernike polynomial values rearranged into a column vector in the same way as for $\mathbf{f}$, and for which the number of columns equals the number of Zernike moments to be extracted14; superscript $\mathsf{T}$ indicates the transpose of the matrix; and $\hat{\mathbf{a}}$ is a column vector of estimated Zernike moments. This method is a least-squares fit assuming a model in which the data are the product of the Zernike polynomials and the moments, plus a random error, i.e., $\mathbf{f} = \mathbf{Z}\mathbf{a} + \boldsymbol{\varepsilon}$.

$$\hat{\mathbf{a}} = \left(\mathbf{Z}^{\mathsf{T}} \mathbf{Z}\right)^{-1} \mathbf{Z}^{\mathsf{T}} \mathbf{f} \tag{15}$$
As discussed at the end of §3.4.5 and shown in Equation (9), a pair of cosine and sine functions captures both the magnitude and phase of a component of a Fourier series. Likewise, a pair of Zernike polynomials $Z_{n,m}$ and $Z_{n,-m}$ with the same $n$ and $|m|$ values captures both magnitude and phase information; therefore, Zernike moments $A_{n,m}$ and $A_{n,-m}$ with the same $n$ and $|m|$ values can be used as paired features that capture both magnitude and phase information.
Theoretically, Zernike moment magnitude and phase features are not rotation invariant, but Zernike moment magnitude features, $\sqrt{A_{n,m}^2 + A_{n,-m}^2}$, are rotation invariant.
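The least-squares estimation of Zernike moments can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this research: the function names, the grid size, and the synthetic surface are hypothetical, the polynomials are the unnormalized real-valued forms with cosine terms for non-negative $m$ and sine terms for negative $m$, and `numpy.linalg.lstsq` is used in place of explicitly forming the normal equations (it solves the same least-squares problem).

```python
import numpy as np
from math import factorial

def radial(n, m, rho):
    """Zernike radial polynomial; requires n - |m| even and |m| <= n."""
    m = abs(m)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coef = ((-1) ** s * factorial(n - s)
                / (factorial(s)
                   * factorial((n + m) // 2 - s)
                   * factorial((n - m) // 2 - s)))
        total += coef * rho ** (n - 2 * s)
    return total

def zernike(n, m, rho, theta):
    """Real-valued Zernike polynomial: cosine for m >= 0, sine for m < 0."""
    angular = np.cos(m * theta) if m >= 0 else np.sin(-m * theta)
    return radial(n, m, rho) * angular

def zernike_indices(max_order):
    """All (n, m) pairs with |m| <= n and n - |m| even, up to max_order."""
    return [(n, m) for n in range(max_order + 1) for m in range(-n, n + 1, 2)]

# Sample points inside the unit disc (hypothetical 81 x 81 grid)
half = 40
coords = np.arange(-half, half + 1) / half
x, y = np.meshgrid(coords, coords)
mask = x**2 + y**2 <= 1.0
rho, theta = np.hypot(x, y)[mask], np.arctan2(y, x)[mask]

indices = zernike_indices(4)
Z = np.column_stack([zernike(n, m, rho, theta) for n, m in indices])

# Synthetic data built from known moments, then recovered by least squares
true_moments = np.zeros(len(indices))
true_moments[indices.index((2, 0))] = 0.8
true_moments[indices.index((3, -1))] = -0.5
f = Z @ true_moments
estimated, *_ = np.linalg.lstsq(Z, f, rcond=None)
```

A magnitude-only feature for a given $n$ and $|m| > 0$ would then be obtained by combining the estimates at `(n, m)` and `(n, -m)`, e.g., with `numpy.hypot`.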
3.4.7. Numbers of features extracted
Given the relatively small size of our dataset, we did not want to extract a very large number of features, but we wanted to extract sufficient information to obtain reasonably good performance on the cartridge-case-comparison task. We initially focussed on extracting Zernike-moment magnitude and phase features for Scenario 1, and a priori believed that extracting up to 10th order moments for the firing-pin impression and up to 20th order for the breech-face region would be a reasonable compromise. We chose a lower order for the firing-pin impression because its gross shape is usually considered an important source of information, whereas we chose a higher order for the breech-face region in order to capture finer details of surface irregularities. In preliminary tests, we also tested up to 5th and up to 15th order for the firing-pin impression, and up to 10th and up to 30th order for the breech-face region, but up to 10th and up to 20th order for the firing-pin impression and breech-face region respectively gave better or no worse results. Up to 10th order Zernike moments (up to $n = 10$) result in a total of 66 magnitude and phase features, and up to 20th order (up to $n = 20$) result in a total of 231 magnitude and phase features. In addition to fitting models to features extracted from the firing-pin impression alone and to features extracted from the breech-face region alone, we also fitted models to the concatenation of these two sets of features. The concatenation of firing-pin plus breech-face features contained a total of 297 features. When extracting features from the entire region of interest (either including or excluding flowback), we used up to 23rd order Zernike moments (up to $n = 23$), resulting in a total of 300 magnitude and phase features. These choices as to number of features to extract are somewhat arbitrary; however, we will treat them as specifications for the system and then validate the performance of that system.
For the Zernike-moment magnitude-only features, using the same orders as stated above, 36 features were extracted from the firing-pin impression, 121 from the breech-face region, and 156 from the whole region of interest.
For the other moment-based feature sets (central moments, circle-moment invariants, and Legendre moments), we extracted approximately the same number of features as we had Zernike-moment magnitude and phase features: up to 7th order (up to $p = q = 7$) from the firing-pin impression, a total of 64 features; up to 14th order (up to $p = q = 14$) from the breech-face region, a total of 225 features; and up to 16th order (up to $p = q = 16$) from the whole region of interest, a total of 289 features.15
As in Thumwarin et al. [10], for the concentric-circle features, we only extracted features from the whole region of interest. A 23rd order Fourier series was fitted to the outermost circle (circle $j$ = 1 with order $K_1$ = 23), matching the order of the Zernike moments. Based on measurements from the cartridge case which was used as the target for rotation, the radius of the outermost circle (the largest circle that could be drawn within the segmented region of interest) was $r_1$ = 1.671 mm. The circumference of that circle was therefore $2\pi r_1 \approx 10.50$ mm, and the period of the highest order component of the Fourier series was therefore $2\pi r_1 / K_1 \approx 0.456$ mm. This specifies the smallest wavelength of repetitive surface irregularities in the region of interest from which these features can extract information. Additional circles were then drawn, concentric to the outermost circle but with smaller radii. Moving from the outermost to the innermost circle, the order of the Fourier series for each circle was set to be two less than that of the previous circle: $K_{j+1} = K_j - 2$, i.e., the orders were 23, 21, 19, …, 5, 3, 1 (a total of 12 circles). The radius of each circle was then calculated such that the period of the highest order component of the Fourier series fitted to that circle was equal to $2\pi r_1 / K_1$, i.e., $r_j = r_1 K_j / K_1$. The resulting radii were 100, 91, 83, 74, 65, 57, 48, 39, 30, 22, 13, and 4% of the radius of the outermost circle. The pattern of reduction in $K$ with reduction in $r$ is illustrated in Fig. 7, but with orders 4, 3, 2, 1. Starting with $K_1$ = 23 and reducing in steps of 2, the total number of concentric-circle magnitude and phase features was 300, and the total number of concentric-circle magnitude-only features was 156.
When extracting moments from the whole of the region of interest excluding flowback, data within the flowback region were not used in the calculations (in Matlab, data within the flowback region were coded as “not a number, NaN”).
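The feature counts and circle radii stated in this subsection can be checked with a short calculation. The counting rules below are assumptions that reproduce the stated totals: one Zernike feature per valid pair of order and repetition, one magnitude-only feature per pair with non-negative repetition, one Cartesian moment per combination of orders on each axis, and, for each concentric circle, a mean plus either a cosine and a sine weight or a single magnitude per component. The function names are hypothetical.

```python
from math import pi

def zernike_pairs(max_order):
    """All (n, m) pairs with |m| <= n and n - |m| even."""
    return [(n, m) for n in range(max_order + 1) for m in range(-n, n + 1, 2)]

def n_zernike_mag_phase(max_order):
    return len(zernike_pairs(max_order))

def n_zernike_mag_only(max_order):
    return sum(1 for _, m in zernike_pairs(max_order) if m >= 0)

def n_cartesian(max_order):
    """Moments with axis orders 0..max_order on each of the two axes."""
    return (max_order + 1) ** 2

# Concentric circles: orders 23, 21, ..., 3, 1
orders = list(range(23, 0, -2))
radii_percent = [round(100 * k / orders[0]) for k in orders]
n_circle_mag_phase = sum(2 * k + 1 for k in orders)   # mean + K cosine + K sine weights
n_circle_mag_only = sum(k + 1 for k in orders)        # mean + K magnitudes

outer_radius_mm = 1.671
circumference_mm = 2 * pi * outer_radius_mm
shortest_period_mm = circumference_mm / orders[0]
```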
3.5. Statistical models
3.5.1. Introduction
This subsection provides details of the statistical modelling pipeline previously outlined in §3.1 and Fig. 2, i.e., dimension reduction (§3.5.2), calculation of uncalibrated likelihood ratios (§3.5.3), and calibration (§3.5.4).
3.5.2. Dimension reduction
Given the relatively small size of the dataset, in order to reduce the number of parameter values to be estimated in the next stage of modelling and in order to reduce potential redundancy of information among features, we reduced the number of feature dimensions using principal component analysis (PCA; Pearson [71]; Hotelling [72]). For each feature set, for the firing-pin impression we reduced the number of dimensions to 10, for the breech-face region to 20, and for the whole region of interest (including or excluding flowback, and including the concatenation of features from the breech-face-region and the firing-pin-impression) to 30. These values were chosen as a compromise between trying not to discard potentially useful information and trying not to have too many dimensions relative to the number of firearms available for training the between-source covariance matrix in the next stage of modelling.
After dimension reduction using PCA, we calculated linear discriminant functions (LDFs; Fisher [73]; Rao [74]) on the training data and transformed all the data into the LDF space. For LDF training, each individual firearm (each source) constituted a category. We did not use LDFs for additional dimension reduction, but only to rotate the data into orthogonal dimensions that maximized between-source versus within-source variance ratios. In preliminary tests, using LDFs for dimension reduction led to worse results. If there are mismatches in conditions between the questioned-source item and the known-source item (as is common in forensic voice comparison), then the mismatch in conditions can be the cause of substantial within-source variability. In this circumstance, training LDFs on data that include the mismatch and using only the lower-order dimensions serves as a mismatch-compensation technique: the lower-order dimensions have higher ratios of between-source to within-source variance, including within-source variance due to mismatched conditions, than do the higher-order dimensions. Since all the cartridge cases in our dataset had brass primer cups, there was no within-source mismatch in conditions for the training data or for the calibration/validation data, hence no mismatch-compensation advantage to be gained from dimension reduction using LDFs.16
3.5.3. Calculation of uncalibrated likelihood ratios
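Uncalibrated likelihood ratios were calculated using a common-source likelihood-ratio model known in the automatic-speaker-recognition literature as the two-covariance version of probabilistic linear discriminant analysis (PLDA; Prince & Elder [75]; Kenny [76]; Brümmer & de Villiers [77]; Sizov et al. [78]).17 We used the implementation from Sizov et al. [78]. The form of the model is as given in Equation (16), in which $\Lambda$ is an uncalibrated likelihood ratio, $\mathcal{N}$ is a multivariate Gaussian probability-density function, $\mathbf{x}_q$ and $\mathbf{x}_k$ are post-PCA-LDF questioned-source and known-source feature vectors respectively, $\hat{\boldsymbol{\mu}}$ is the estimate of the mean vector for the relevant population, and $\hat{\mathbf{W}}$ and $\hat{\mathbf{B}}$ are, respectively, the within-source covariance matrix and the between-source covariance matrix estimates for the relevant population.

$$\Lambda = \frac{\mathcal{N}\left( \begin{bmatrix} \mathbf{x}_q \\ \mathbf{x}_k \end{bmatrix} \,\middle|\, \begin{bmatrix} \hat{\boldsymbol{\mu}} \\ \hat{\boldsymbol{\mu}} \end{bmatrix}, \begin{bmatrix} \hat{\mathbf{B}} + \hat{\mathbf{W}} & \hat{\mathbf{B}} \\ \hat{\mathbf{B}} & \hat{\mathbf{B}} + \hat{\mathbf{W}} \end{bmatrix} \right)}{\mathcal{N}\left( \mathbf{x}_q \,\middle|\, \hat{\boldsymbol{\mu}}, \hat{\mathbf{B}} + \hat{\mathbf{W}} \right) \, \mathcal{N}\left( \mathbf{x}_k \,\middle|\, \hat{\boldsymbol{\mu}}, \hat{\mathbf{B}} + \hat{\mathbf{W}} \right)} \tag{16}$$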
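A common way of writing the two-covariance likelihood ratio is as a joint Gaussian density for the stacked pair of vectors (same-source hypothesis) divided by the product of two marginal Gaussian densities (different-source hypothesis). The following is a minimal sketch of that textbook form, not the implementation from Sizov et al. [78]; the function names and the toy parameter values are hypothetical.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + d @ np.linalg.solve(cov, d))

def two_cov_log_lr(x_q, x_k, mu, W, B):
    """Natural-log likelihood ratio for the same-source versus different-source hypotheses."""
    total = B + W
    joint_cov = np.block([[total, B], [B, total]])   # vectors share a source mean
    joint = gaussian_logpdf(np.concatenate([x_q, x_k]),
                            np.concatenate([mu, mu]), joint_cov)
    separate = gaussian_logpdf(x_q, mu, total) + gaussian_logpdf(x_k, mu, total)
    return joint - separate

# Toy two-dimensional example with small within-source variance
mu = np.zeros(2)
W = 0.1 * np.eye(2)
B = np.eye(2)
x_q = np.array([1.5, -1.0])
log_lr_close = two_cov_log_lr(x_q, np.array([1.4, -1.1]), mu, W, B)
log_lr_far = two_cov_log_lr(x_q, np.array([-1.5, 1.0]), mu, W, B)
```

In this toy example, a pair of vectors that lie close together relative to the within-source variance yields a positive log likelihood ratio, and a pair that lie far apart yields a negative one.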
For each segmented region, we trained three different PLDA models, which differed in their $\hat{\mathbf{W}}$ values:
Model 1 v 1 corresponds to Scenario 1.
A pooled $\hat{\mathbf{W}}$ was calculated using all feature vectors from all sources in the training data.
Model 1 v 3 corresponds to Scenario 2 and assumes the practitioner fired 3 cartridges from the seized firearm.
From the 10 feature vectors of each source (corresponding to the 10 cartridge cases from each firearm), there are 120 possible combinations of 3 feature vectors. 10 of these combinations were randomly selected, and the mean vector for each of these combinations was calculated. A pooled $\hat{\mathbf{W}}$ was then calculated using the combination of all the original singleton feature vectors and all the three-mean feature vectors from all sources in the training data.
Model 1 v 9 corresponds to Scenario 2 and assumes the practitioner fired 9 cartridges from the seized firearm.
From the 10 feature vectors of each source (corresponding to the 10 cartridge cases from each firearm), all possible combinations of 9 feature vectors were drawn, and the mean vector for each of these combinations was calculated.18 A pooled $\hat{\mathbf{W}}$ was then calculated using the combination of all the original singleton feature vectors and all the nine-mean feature vectors from all sources in the training data.
Model 1 v 1, Model 1 v 3, and Model 1 v 9 will have successively smaller-valued within-source covariance matrices, the latter two reflecting the size of the group of known-source cartridge cases that will be compared with the questioned-source cartridge case.
The mean vector for each source in the training data was calculated using, as applicable for each model, all the original singleton feature vectors from that source, or all the original singleton feature vectors from that source plus all the three-mean or all the nine-mean feature vectors belonging to that source. $\hat{\boldsymbol{\mu}}$ and $\hat{\mathbf{B}}$ were then calculated using all of the mean vectors from each source.
Prior to training the PLDA model, independently for each feature-vector dimension, the training data were centred to 0 and were scaled to a standard deviation of 1. These transformations, obtained from the mean and standard deviation of the training data, were subsequently applied to the calibration/validation data. Given this centring and scaling, for Model 1 v 1 and Model 1 v 9, should be a vector of zeros, the diagonal of should be a vector of ones, and the values of should be the same for both models. These values will differ slightly for Model 1 v 3 because of the random sub-selection of data used in training that model.
3.5.4. Calibration
Whereas the model used to calculate uncalibrated likelihood ratios requires the estimation of a large number of parameter values in a multivariate data space, a calibration model is a parsimonious model which requires the estimation of a small number of parameters in a univariate space. The ratio of parameter values to be estimated relative to the number of data points is therefore much smaller for the latter model than for the former.
We calibrated the uncalibrated likelihood ratios using a logistic-regression model. Logistic regression is commonly used as a calibration model in forensic voice comparison (González-Rodríguez et al. [80]; Morrison [81]). We used the regularized-logistic-regression model described in Morrison & Poh [82], with a regularization weight equivalent to a set of feature vectors from one firearm.19
For each segmented region and for each PLDA model (Model 1 v 1, Model 1 v 3, and Model 1 v 9), we trained different calibration models. Each calibration model was trained using a set of same-source scores and a set of different-source scores, where: a “score” is an uncalibrated log likelihood ratio, $S = \log(\Lambda)$;20 a same-source score, $S_s$, is the logged output of a PLDA model when the input is a pair of feature vectors originating from different cartridge cases fired from the same firearm ($\mathbf{x}_i$ versus $\mathbf{x}_j$ with $i = j$); and a different-source score, $S_d$, is the logged output of a PLDA model when the input is a pair of feature vectors originating from cartridge cases fired from different firearms ($\mathbf{x}_i$ versus $\mathbf{x}_j$ with $i \ne j$). To calculate scores for training the calibration model, $\mathbf{x}_i$ versus $\mathbf{x}_j$ pairs were entered into Equation (16), with $\mathbf{x}_i$ in place of $\mathbf{x}_q$ and with $\mathbf{x}_j$ in place of $\mathbf{x}_k$. Given a set of same-source scores and a set of different-source scores, a logistic regression model was trained using an iterative procedure (conjugate-gradient method; Hestenes & Stiefel [83]; Minka [84]) that estimated values for the intercept and slope coefficients $\beta_0$ and $\beta_1$ of Equation (17), which was then used to convert each uncalibrated log likelihood ratio, $S$, to a calibrated log likelihood ratio, $\log(\Lambda_{\mathrm{cal}})$.21

$$\log(\Lambda_{\mathrm{cal}}) = \beta_0 + \beta_1 S \tag{17}$$
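The conversion of scores to calibrated log likelihood ratios via an intercept and a slope can be sketched as follows. This sketch makes several simplifying assumptions that differ from the procedure described above: it is unregularized, it uses Newton's method rather than the conjugate-gradient method, it uses natural logarithms, and it removes the effect of the training-set proportions with a fixed log-odds offset. The function names and the synthetic Gaussian scores are hypothetical.

```python
import numpy as np

def train_calibration(ss_scores, ds_scores, n_iter=50):
    """Fit an intercept and slope by logistic regression (Newton's method, no regularization)."""
    s = np.concatenate([ss_scores, ds_scores])
    t = np.concatenate([np.ones(len(ss_scores)), np.zeros(len(ds_scores))])
    X = np.column_stack([np.ones_like(s), s])
    # Offset so that the fitted linear function is a log likelihood ratio
    # rather than posterior log odds under the training proportions
    prior_log_odds = np.log(len(ss_scores) / len(ds_scores))
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta + prior_log_odds)))
        gradient = X.T @ (t - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def calibrate(scores, beta):
    """Apply the intercept and slope to convert scores to calibrated log likelihood ratios."""
    return beta[0] + beta[1] * np.asarray(scores)

# Synthetic scores: equal-variance Gaussians for which the true log LR equals the score
rng = np.random.default_rng(0)
ss = rng.normal(2.0, 2.0, 2000)
ds = rng.normal(-2.0, 2.0, 2000)
beta = train_calibration(ss, ds)
calibrated_ss = calibrate(ss, beta)
calibrated_ds = calibrate(ds, beta)
```

Because the synthetic scores were constructed so that the true log likelihood ratio equals the score itself, the estimated intercept should be near 0 and the estimated slope near 1.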
Same-source pairs of feature vectors ($\mathbf{x}_i$ versus $\mathbf{x}_j$ with $i = j$) and different-source pairs of feature vectors ($\mathbf{x}_i$ versus $\mathbf{x}_j$ with $i \ne j$) for training each calibration model (and for cross-validation) were constructed as follows:
Model 1 v 1: To create same-source pairs of feature vectors, all possible combinations of 2 feature vectors were drawn from the 10 feature vectors originating from a firearm. One of the feature vectors in each pair was assigned to $\mathbf{x}_i$ and the other to $\mathbf{x}_j$ (the Model 1 v 1 PLDA model is symmetrical so the order of assignment is irrelevant). This resulted in 45 pairs of same-source feature vectors from each firearm. To create different-source pairs of feature vectors, each feature vector from each firearm was compared with each feature vector from every other firearm. This resulted in 100 pairs of different-source feature vectors from each pair of firearms.
Model 1 v 3: To create same-source pairs of feature vectors, each of the 10 feature vectors originating from a firearm was selected in turn, and the selected singleton feature vector was assigned to $\mathbf{x}_i$. From the remaining 9 feature vectors of each firearm, using random selection without replacement, 3 non-overlapping combinations of 3 feature vectors were drawn, and the mean vector of each combination was in turn assigned to $\mathbf{x}_j$. This resulted in 30 pairs of same-source feature vectors from each firearm. To create different-source pairs of feature vectors, each feature vector from each firearm was compared with each of the mean vectors of 3 non-overlapping randomly selected combinations of 3 feature vectors from each of the other firearms. The combinations of 3 feature vectors were randomly selected without replacement from the total of 10 feature vectors from the second firearm (one of the feature vectors was not used). A different random selection from the second firearm was used for comparison with each of the singleton feature vectors from the first firearm. The singleton feature vector was assigned to $\mathbf{x}_i$ and each of the three-mean vectors was in turn assigned to $\mathbf{x}_j$. This resulted in 30 pairs of different-source feature vectors from each pair of firearms (with $\mathbf{x}_i$ versus $\mathbf{x}_j$ counted as a different pair to $\mathbf{x}_j$ versus $\mathbf{x}_i$).
Model 1 v 9: To create same-source pairs of feature vectors, each of the 10 feature vectors originating from a firearm was selected in turn, the selected singleton feature vector was assigned to $\mathbf{x}_i$, and the mean vector of the other 9 feature vectors was assigned to $\mathbf{x}_j$. This resulted in 10 pairs of same-source feature vectors from each firearm. To create different-source pairs of feature vectors, each feature vector from each firearm was compared with the mean of each of the 10 possible combinations of 9 feature vectors from every other firearm. The singleton feature vector was assigned to $\mathbf{x}_i$ and the nine-mean vector to $\mathbf{x}_j$. This resulted in 100 pairs of different-source feature vectors from each pair of firearms (with $\mathbf{x}_i$ versus $\mathbf{x}_j$ counted as a different pair to $\mathbf{x}_j$ versus $\mathbf{x}_i$).
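The pair counts stated above can be reproduced by enumerating cartridge-case indices rather than feature vectors. In this sketch the indices 0 to 9 stand for the 10 cartridge cases from a firearm, the different-source lists are for one ordered pair of firearms, and the helper name `three_groups_of_three` and the random seed are hypothetical.

```python
import random
from itertools import combinations
from math import comb

cases = list(range(10))            # 10 cartridge cases per firearm
rng = random.Random(0)             # hypothetical seed, for repeatability only

def three_groups_of_three(available, rng):
    """Randomly draw 3 non-overlapping groups of 3 without replacement."""
    shuffled = rng.sample(available, len(available))
    return [tuple(shuffled[i:i + 3]) for i in (0, 3, 6)]

# Model 1 v 1
ss_1v1 = list(combinations(cases, 2))
ds_1v1 = [(a, b) for a in cases for b in cases]

# Model 1 v 3: singleton versus the mean of a group of 3
ss_1v3 = [(q, g) for q in cases
          for g in three_groups_of_three([c for c in cases if c != q], rng)]
ds_1v3 = [(q, g) for q in cases
          for g in three_groups_of_three(cases, rng)]   # fresh selection per singleton

# Model 1 v 9: singleton versus the mean of a group of 9
ss_1v9 = [(q, tuple(c for c in cases if c != q)) for q in cases]
ds_1v9 = [(q, g) for q in cases for g in combinations(cases, 9)]

n_three_combinations = comb(10, 3)
```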
In addition to separately calibrating the scores from each of the firing-pin impression and the breech-face region, we also used a logistic-regression model to simultaneously fuse and calibrate scores from these two regions. The scores were parallel in that each firing-pin-impression score corresponded to a breech-face-region score that was calculated using the same combination of digital images (including for Model 1 v 3, the same random selections of images). Given a parallel set of same-source and different-source scores, a regularized-logistic-regression model was trained resulting in estimated values for the intercept $\beta_0$ and for two slope coefficients $\beta_1$ and $\beta_2$. These coefficient values were then used to fuse and calibrate a parallel pair of scores, $S_{\mathrm{fp}}$ extracted from the firing-pin impression and $S_{\mathrm{bf}}$ extracted from the breech-face region, as in Equation (18).

$$\log(\Lambda_{\mathrm{cal}}) = \beta_0 + \beta_1 S_{\mathrm{fp}} + \beta_2 S_{\mathrm{bf}} \tag{18}$$
Calibration and validation were performed together using cross-validation (see §4.2 for details).
4. Validation
4.1. Introduction
A system validation was conducted for each different feature-extraction method applied to each different segmented region. Validation was conducted according to the relevant recommendations in the Consensus on validation of forensic voice comparison (Morrison et al. [85]). In this section, we describe the validation procedures (§4.2), and the metric (log-likelihood-ratio cost, Cllr; §4.3) and graphic (Tippett plot; §4.4) used to represent the results.
4.2. Validation procedures
Calibration and validation were performed using cross-validation, comparing feature vectors from each firearm with other feature vectors from the same firearm and with feature vectors from all the other firearms in the calibration/validation set.
Considering a matrix of all possible combinations of two cartridge cases: Since Model 1 v 1 is symmetrical, the same-source comparisons were those on the diagonal of the matrix and the different-source comparisons were those on the upper right of the matrix (or those on the bottom left, but not both). Since Model 1 v 9 and Model 1 v 3 are not symmetrical, the same-source comparisons were those on the diagonal of the matrix and the different-source comparisons were those on both the upper right and the lower left of the matrix.
Leave-one-source-out/leave-two-sources-out cross-validation was used: In a cross-validation loop in which the score to be calibrated was a same-source score, e.g., the result of comparing a cartridge case fired from firearm A with another cartridge case fired from firearm A, all scores that resulted from comparisons in which one or both members of the pair was a cartridge case fired from firearm A were excluded from the data used to train the calibration model (leave-one-source-out). In a cross-validation loop in which the score to be calibrated was a different-source score, e.g., the result of comparing a cartridge case fired from firearm A with a cartridge case fired from firearm B, all scores that resulted from comparisons in which one or both members of the pair was a cartridge case fired from firearm A or a cartridge case fired from firearm B were excluded from the data used to train the calibration model (leave-two-sources-out).
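The exclusion rule used in each cross-validation loop can be sketched as a selection over the source labels of the two members of each scored pair. The function name `calibration_training_selection` and the four-firearm toy labels are hypothetical.

```python
def calibration_training_selection(sources_q, sources_k, held_q, held_k):
    """Flag scores usable for training: neither member comes from a held-out source."""
    held = {held_q, held_k}        # one source if same-source, two if different-source
    return [q not in held and k not in held
            for q, k in zip(sources_q, sources_k)]

# Toy set of scored pairs, identified by the firearm of each member
sources_q = ["A", "A", "A", "B", "B", "C", "C", "D"]
sources_k = ["A", "B", "C", "B", "C", "C", "D", "D"]

# Calibrating a same-source score from firearm A: leave-one-source-out
keep_for_aa = calibration_training_selection(sources_q, sources_k, "A", "A")

# Calibrating a different-source score from firearms A and B: leave-two-sources-out
keep_for_ab = calibration_training_selection(sources_q, sources_k, "A", "B")
```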
4.3. Validation metric: Log-likelihood-ratio cost (Cllr)
Given a same-source input, a good output from a forensic-evaluation system would be a likelihood-ratio value that is much larger than 1, a less good output would be a value that is only a little larger than 1, a bad output would be a value less than 1, and a worse output would be a value much less than 1. Mutatis mutandis, given a different-source input, a good output would be a value much less than 1.
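A metric that captures this gradient goodness is the log-likelihood-ratio cost (Cllr; Brümmer & du Preez [86]), which is calculated as in Equation (19), in which $\Lambda_{s_i}$ and $\Lambda_{d_j}$ are likelihood-ratio outputs corresponding to same-source and different-source input pairs respectively, and $N_s$ and $N_d$ are the number of same-source and different-source input pairs respectively.

$$C_{\mathrm{llr}} = \frac{1}{2} \left( \frac{1}{N_s} \sum_{i=1}^{N_s} \log_2\left(1 + \frac{1}{\Lambda_{s_i}}\right) + \frac{1}{N_d} \sum_{j=1}^{N_d} \log_2\left(1 + \Lambda_{d_j}\right) \right) \tag{19}$$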
Lower Cllr values indicate better performance. Cllr values cannot be less than 0. A system that always responded with a likelihood ratio of 1 irrespective of the input, and hence gave no useful information, would have a Cllr value of 1. A system with a Cllr of less than 1 is providing useful information. Cllr values substantially greater than 1 can be produced by uncalibrated or miscalibrated systems.
For further explanation of Cllr and its interpretation, see Appendix C of Morrison et al. [85].
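The calculation of Cllr from sets of likelihood-ratio values can be sketched in a few lines. The function name `cllr` and the example values are hypothetical; the inputs are likelihood ratios, not log likelihood ratios.

```python
import numpy as np

def cllr(lr_same, lr_diff):
    """Log-likelihood-ratio cost from same-source and different-source likelihood ratios."""
    lr_same = np.asarray(lr_same, dtype=float)
    lr_diff = np.asarray(lr_diff, dtype=float)
    same_term = np.mean(np.log2(1.0 + 1.0 / lr_same))   # penalizes small same-source LRs
    diff_term = np.mean(np.log2(1.0 + lr_diff))         # penalizes large different-source LRs
    return 0.5 * (same_term + diff_term)

uninformative = cllr([1.0, 1.0], [1.0, 1.0, 1.0])   # always responds with LR = 1
good = cllr([1000.0], [0.001])                      # strong support for the correct hypothesis
misleading = cllr([0.01], [100.0])                  # strong support for the wrong hypothesis
```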
4.4. Validation graphic: Tippett plot
Tippett plots (Meuwly [87]) consist of plots of the empirical cumulative probability distributions of the same-source log-likelihood-ratio values and of the different-source log-likelihood-ratio values. The tradition is to plot lines joining the data points rather than to plot the data points themselves. Tippett plots of some of the results of the present study are provided in Fig. 9 below. The y-axis values corresponding to the curves rising to the right give the proportion of same-source test results with log likelihood-ratio values less than or equal to the corresponding value on the x-axis. The y-axis values corresponding to the curves rising to the left give the proportion of different-source test results with log likelihood-ratio values greater than or equal to the corresponding value on the x-axis. In general, shallower curves with greater separation between the two curves indicate better performance. Tippett plots give an indication of the range of possible likelihood-ratio values that the system could generate under the test conditions, and can also reveal problems such as bias in the output.
Fig. 9.
Tippett plots of validation results obtained using Zernike moment magnitude and phase features extracted from the whole of the region of interest including flowback. (a) Model 1 v 1. (b) Model 1 v 3. (c) Model 1 v 9.
For further explanation of Tippett plots and their interpretation, see Appendix C of Morrison et al. [85].
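The two curves of a Tippett plot can be computed from sorted log-likelihood-ratio values as sketched below. The function name `tippett_curves` and the example values are hypothetical, the sketch assumes there are no tied values, and plotting itself (e.g., joining the returned points with lines) is omitted.

```python
import numpy as np

def tippett_curves(log_lr_same, log_lr_diff):
    """Return (x, y) points for the same-source and different-source curves."""
    same = np.sort(np.asarray(log_lr_same, dtype=float))
    diff = np.sort(np.asarray(log_lr_diff, dtype=float))
    # Proportion of same-source values less than or equal to each x (rises to the right)
    prop_same = np.arange(1, len(same) + 1) / len(same)
    # Proportion of different-source values greater than or equal to each x (rises to the left)
    prop_diff = np.arange(len(diff), 0, -1) / len(diff)
    return (same, prop_same), (diff, prop_diff)

(same_x, same_y), (diff_x, diff_y) = tippett_curves([0.5, 2.0, 1.0], [-1.0, -3.0])
```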
5. Results
5.1. Introduction
In this section, we present the validation results, including Cllr values (§5.2) and selected Tippett plots (§5.3).22
5.2. Cllr values
Table 1, Table 2, and Table 3 provide Cllr values obtained from the validations of Model 1 v 1, Model 1 v 3, and Model 1 v 9 respectively. Each table provides Cllr values from the factorial of combinations of feature set and segmented region (including using feature concatenation and score-level fusion to combine features extracted separately from the breech-face region and the firing-pin impression). Examining these tables, a clear pattern of results emerges:
1. The feature set resulting in best performance is the Zernike moment magnitude and phase feature set.
2. The segmented region resulting in best performance is the whole of the region of interest including flowback.
3. The larger the size of the group of known-source cartridge cases that was compared to the questioned-source cartridge case, the better the performance.
Table 1.
Cllr values for each combination of feature set and segmented region for Model 1 v 1.
| Feature set | Whole region of interest: including flowback | Whole region of interest: excluding flowback | Breech face | Firing pin | Breech face + firing pin: feature concat. | Breech face + firing pin: score-level fusion |
|---|---|---|---|---|---|---|
| central moments | 0.616 | 0.671 | 0.710 | 0.923 | 0.682 | 0.697 |
| circle-moment invariants | 0.597 | 0.673 | 0.695 | 0.962 | 0.677 | 0.693 |
| Legendre moments | 0.577 | 0.679 | 0.719 | 0.923 | 0.709 | 0.707 |
| concentric-circle features (mag.) | 0.586 | – | – | – | – | – |
| concentric-circle features (mag. & phase) | 0.526 | – | – | – | – | – |
| Zernike moments (mag.) | 0.531 | 0.652 | 0.684 | 0.852 | 0.615 | 0.632 |
| Zernike moments (mag. & phase) | 0.519 | 0.645 | 0.689 | 0.841 | 0.605 | 0.635 |
Table 2.
Cllr values for each combination of feature set and segmented region for Model 1 v 3.
| Feature set | Whole region of interest: including flowback | Whole region of interest: excluding flowback | Breech face | Firing pin | Breech face + firing pin: feature concat. | Breech face + firing pin: score-level fusion |
|---|---|---|---|---|---|---|
| central moments | 0.491 | 0.527 | 0.574 | 0.858 | 0.537 | 0.553 |
| circle-moment invariants | 0.467 | 0.532 | 0.557 | 0.901 | 0.537 | 0.551 |
| Legendre moments | 0.448 | 0.538 | 0.583 | 0.845 | 0.571 | 0.563 |
| concentric-circle features (mag.) | 0.435 | – | – | – | – | – |
| concentric-circle features (mag. & phase) | 0.390 | – | – | – | – | – |
| Zernike moments (mag.) | 0.390 | 0.502 | 0.547 | 0.752 | 0.459 | 0.476 |
| Zernike moments (mag. & phase) | 0.384 | 0.498 | 0.550 | 0.730 | 0.449 | 0.478 |
Table 3.
Cllr values for each combination of feature set and segmented region for Model 1 v 9.
| Feature set | Whole region of interest: including flowback | Whole region of interest: excluding flowback | Breech face | Firing pin | Breech face + firing pin: feature concat. | Breech face + firing pin: score-level fusion |
|---|---|---|---|---|---|---|
| central moments | 0.485 | 0.497 | 0.542 | 0.843 | 0.506 | 0.534 |
| circle-moment invariants | 0.465 | 0.494 | 0.527 | 0.913 | 0.524 | 0.529 |
| Legendre moments | 0.420 | 0.501 | 0.549 | 0.822 | 0.546 | 0.534 |
| concentric-circle features (mag.) | 0.416 | – | – | – | – | – |
| concentric-circle features (mag. & phase) | 0.363 | – | – | – | – | – |
| Zernike moments (mag.) | 0.359 | 0.441 | 0.493 | 0.699 | 0.406 | 0.421 |
| Zernike moments (mag. & phase) | 0.351 | 0.450 | 0.508 | 0.678 | 0.401 | 0.430 |
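For readers who wish to compute comparable values from their own validation output, the following is a minimal sketch of the log-likelihood-ratio cost (Cllr) as it is standardly defined in the forensic-voice-comparison literature. It is not the code used in the present study. The function name `cllr` and its arguments are hypothetical, and the inputs are assumed to be natural-log likelihood ratios.

```python
import numpy as np

def cllr(ss_log_lrs, ds_log_lrs):
    """Log-likelihood-ratio cost from natural-log likelihood ratios.

    Hypothetical helper: ss_log_lrs come from same-source comparisons,
    ds_log_lrs from different-source comparisons.
    """
    ss = np.asarray(ss_log_lrs, dtype=float)
    ds = np.asarray(ds_log_lrs, dtype=float)
    # Penalty log2(1 + 1/LR) for same-source and log2(1 + LR) for
    # different-source, computed stably: logaddexp(0, x) = ln(1 + exp(x)).
    ss_penalty = np.logaddexp(0.0, -ss) / np.log(2.0)
    ds_penalty = np.logaddexp(0.0, ds) / np.log(2.0)
    return 0.5 * (ss_penalty.mean() + ds_penalty.mean())
```

A system that always outputs a likelihood ratio of 1 (log likelihood ratio of 0) gives Cllr = 1. Lower values indicate better performance, and values above 1 indicate output that is worse than uninformative.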
5.3. Tippett plots
Fig. 9 provides Tippett plots of validation results obtained using Zernike moment magnitude and phase features extracted from the whole of the region of interest including flowback. The results show good calibration, with the same-source and different-source curves crossing near a log-likelihood-ratio value of 0. For all models, likelihood-ratio values into the thousands in favour of same source would be supported, and likelihood-ratio values into the tens of thousands in favour of different source would be supported. The larger the number of known-source cartridge cases used, the more negative the limit of different-source log-likelihood-ratio values obtained (the x-axes of the Tippett plots are truncated, hence not all values are shown). Related to this pattern, substantial asymmetry is apparent for the same-source versus different-source log-likelihood-ratio values from Model 1 v 9. This increasing asymmetry is the expected and commonly observed pattern as within-source variability becomes much less than between-source variability.23
6. Discussion
6.1. Introduction
In this section, based on the results, we discuss:
- what we consider to be the best feature set (§6.2)
- the benefit of rotating the data matrices to a common target (§6.3)
- the benefit of using more known-source cartridge cases (§6.4)
- what we consider to be the best segmentation of the cartridge case base (§6.5)
6.2. Best feature set
In §3.4.1, on theoretical grounds and based on empirical results of applications in other fields, we hypothesized that using Zernike moments as features would result in better performance than using any of the other feature sets previously proposed in the literature on forensic comparison of fired cartridge cases.
Our results demonstrated that this was indeed the case with respect to other moment-based feature sets. For the whole region of interest including flowback, compared to Legendre moment features (the best performing non-Zernike moment-based feature set), Cllr values for Zernike moment magnitude and phase features were lower by 10%, 14%, and 16% for Model 1 v 1, Model 1 v 3, and Model 1 v 9 respectively.
Compared to concentric-circle magnitude and phase features, however, Cllr values for Zernike moment magnitude and phase features were only lower by 1%, 2%, and 3% for Model 1 v 1, Model 1 v 3, and Model 1 v 9 respectively. Although the improvement is slight, Zernike moment magnitude and phase features have the advantage of being simpler to extract.
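The percentages quoted in the two preceding paragraphs can be checked directly from the whole-region-of-interest-including-flowback columns of Table 1, Table 2, and Table 3. The short sketch below does this; the helper name `percent_lower` and the dictionary layout are illustrative choices, and the Cllr values are copied from the tables.

```python
def percent_lower(reference, candidate):
    """Percentage by which `candidate` is lower than `reference`."""
    return 100.0 * (reference - candidate) / reference

# Cllr values, whole region of interest including flowback (Tables 1 to 3).
zernike_mag_phase = {"1 v 1": 0.519, "1 v 3": 0.384, "1 v 9": 0.351}
legendre = {"1 v 1": 0.577, "1 v 3": 0.448, "1 v 9": 0.420}
concentric_mag_phase = {"1 v 1": 0.526, "1 v 3": 0.390, "1 v 9": 0.363}

for model, z in zernike_mag_phase.items():
    vs_legendre = percent_lower(legendre[model], z)
    vs_concentric = percent_lower(concentric_mag_phase[model], z)
    print(f"Model {model}: {vs_legendre:.0f}% lower than Legendre, "
          f"{vs_concentric:.0f}% lower than concentric-circle")
```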
For future work, including ultimate application to casework, we therefore consider Zernike moment magnitude and phase features to be the best feature set to use.
6.3. Benefit of rotation
In §3.3 we noted that the cost of rotating the data matrices is a reason to prefer rotation-invariant features, but, if rotation leads to substantial improvement in performance, that cost may be justified.
After performing rotation, the theoretically non-rotation-invariant Zernike moment magnitude and phase features did not consistently result in better performance than the theoretically rotation-invariant Zernike moment magnitude-only features (see Table 1, Table 2, and Table 3). For all three models, for the breech-face region alone and for score-level fusion (and for Model 1 v 9 for the whole region of interest excluding flowback), Cllr values were actually lower for Zernike moment magnitude-only features than for Zernike moment magnitude and phase features.
For all three models, for the whole region of interest including flowback, however, Cllr values were lower for Zernike moment magnitude and phase features than for Zernike moment magnitude-only features, albeit only by approximately 2%.
We ran an additional set of validations without rotation, using Zernike moment magnitude-only features and Zernike moment magnitude and phase features extracted from the whole region of interest including flowback. For all combinations of model and for both magnitude-only and magnitude-and-phase features, the Cllr values for rotated versus non-rotated image data were less than 1% different.24 Thus, even if not theoretically rotation invariant, the magnitude and phase features in practice gave equally good results irrespective of whether rotation was applied to the data matrices or not.
For future work, including ultimate application to casework, we therefore consider the cost of performing rotation to be not justified.
6.4. Benefit of using more known-source cartridge cases
In §3.5.3 we described using different numbers of known-source cartridge cases for training, resulting in Model 1 v 1, Model 1 v 3, and Model 1 v 9 having successively smaller-valued within-source covariance matrices. The expected result of this is that models with smaller ratios of within-source versus between-source covariance matrix magnitudes will produce a larger range of log-likelihood-ratio values, extending from higher-magnitude negative log-likelihood-ratio values for different-source comparisons to higher-magnitude positive log-likelihood-ratio values for same-source comparisons.
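One way to see why the within-source covariance shrinks is the following sketch. It rests on an illustrative assumption that is not a restatement of §3.5.3: that the known-source item is represented by the mean of $n$ feature vectors $\mathbf{x}_i$ drawn independently from a distribution with within-source covariance $\boldsymbol{\Sigma}_w$ (the symbols are introduced here for illustration only).

```latex
\bar{\mathbf{x}}_n = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i ,
\qquad
\operatorname{Cov}\left(\bar{\mathbf{x}}_n\right) = \frac{1}{n}\,\boldsymbol{\Sigma}_w .
```

Under that assumption, moving from $n = 1$ to $n = 3$ to $n = 9$ scales the covariance of the known-source representation by 1, 1/3, and 1/9 respectively, while the between-source covariance is unaffected.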
For Zernike moment magnitude and phase features extracted from the whole region of interest including flowback, the results were as expected for different-source comparisons, but not so for same-source comparisons (see Fig. 9). Increasing the number of known-source cartridge cases clearly improved results for different-source comparisons but did not clearly do so for same-source comparisons: in Fig. 9c the largest same-source log-likelihood-ratio value was actually less than in Fig. 9a and Fig. 9b.
Although the 9% reduction in Cllr values for Model 1 v 9 compared to Model 1 v 3 (0.351 compared to 0.384) appears to be substantial, if it is primarily due to large-magnitude negative log likelihood ratios from different-source comparisons getting even more negative, the increase in performance indicated by the Cllr values may not be particularly pertinent in casework.
In the context of casework, in which firing 3 cartridge cases from a seized firearm is currently the norm, the cost of firing 9 cartridge cases instead may not be justified. This is an issue to revisit once a larger database is collected and potentially better performing systems are developed.
For training and validation purposes, we recommend firing 10 cartridge cases from each firearm. These can be used to make multiple sets of data for training and validating Model 1 v 3.
6.5. Best segmentation
As mentioned in §1.3, in current casework practice, practitioners tend to visually compare the firing-pin impressions and the breech-face regions of pairs of fired cartridge cases. As mentioned in §1.4.3, in previous research using data, quantitative measurements, and statistical models, flowback has usually been excluded from analysis. In §3.3, however, we hypothesized that the flowback region would contain information related to the firearm that fired the cartridge.
Combining information from the firing-pin impression and the breech-face region was expected to result in better performance than using one of these alone, and that result was obtained: in Table 1, Table 2, and Table 3 it can be observed that any of the means tested for combining firing-pin-impression and breech-face-region information almost always resulted in lower Cllr values than using either of these alone.25
As we hypothesized, however, the best performance was obtained by extracting features from the whole region of interest, including not only the breech-face region and the firing-pin impression, but also the flowback region. It therefore appears that, contrary to received wisdom, the flowback region does contain useful information about the firearm that fired the cartridge case.
Practically, only having to segment the region of interest from the headstamp region, and not having to additionally segment the firing-pin impression and the breech-face region, will result in a simpler and faster system for comparing fired cartridge cases.
For future work, including ultimate application to casework, we therefore consider the whole region of interest including flowback to be the best segmented region to use.
7. Conclusion
The present paper described and validated a feature-based system for calculation of likelihood ratios from 3D digital images of fired cartridge cases. The system includes a database of 3D digital images of the bases of approximately 3,000 fired cartridge cases, consisting of 10 cartridges fired per firearm from approximately 300 firearms of the same class (semi-automatic pistols that fire 9 mm diameter centre-fire Luger-type ammunition, and that have hemispherical firing pins and parallel breech-face marks). The images were captured using Evofinder®, an imaging system that is commonly used by operational forensic laboratories. Although in terms of the combination of number of firearms of the same class and number of fires per firearm, this may be one of the largest databases in existence, we consider it relatively small for training statistical models that take account of both within-source and between-source variability. Given this relatively small database, we were encouraged by the relatively good validation results.
An important component of the research reported in the present paper was the comparison of different methods for feature extraction. Key conclusions were:
- Of the feature sets tested, the best performance was achieved using Zernike moment magnitude and phase features.
- Performance of Zernike moment magnitude and phase features was equally good irrespective of whether the data matrices were rotated prior to feature extraction or not. Use of costly rotation procedures is therefore not necessary.
- The best performance was achieved by directly extracting features from the whole of the region of interest (the firing-pin impression plus the flowback region plus the breech-face region), rather than by any process that involved separately segmenting the firing-pin impression and the breech-face region.
- In the context of casework involving comparison of a fired cartridge case recovered from a crime scene with cartridges fired from a seized firearm, using 3 cartridges fired from the seized firearm would appear to be sufficient to achieve good results. Use of a larger number of fires per firearm would, however, be advisable for system training and validation.
In future work aimed at developing better performing systems, we will therefore use Zernike moment magnitude and phase features extracted from the whole of the region of interest without rotation of data matrices prior to feature extraction.
Planned future work includes expanding the size of the database to the point where it will be sufficient for training a DNN-embedding based system, which is currently the state-of-the-art approach in forensic voice comparison, and which is expected to lead to substantial improvements in system performance. Planned future work will also ultimately include field testing by practitioners of a later version of the system.
Disclaimer
All opinions expressed in the present paper are those of the authors, and, unless explicitly stated otherwise, should not be construed as representing the policies or positions of any organizations with which the authors are associated.
Author contributions
Nabanita Basu: Conceptualization, Formal analysis, Methodology, Software, Validation, Visualization, Writing - Original Draft, Writing - Review & Editing. Rachel S Bolton-King: Conceptualization, Data curation, Investigation, Methodology, Supervision, Writing - Review & Editing. Geoffrey Stewart Morrison: Conceptualization, Funding acquisition, Methodology, Supervision, Visualization, Writing - Original Draft, Writing - Review & Editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This research was supported by Research England's Expanding Excellence in England Fund as part of funding for the Aston Institute for Forensic Linguistics 2019–2023.
Thanks to Dr Michael Derenovskiy and his colleagues at ScannBI Technology Europe GmbH for the loan of the Evofinder® imaging system.
Thanks to the organizations and individuals who donated the fired cartridge cases. To maintain their anonymity, we do not thank them by name.
For simplicity, we assume centre-fire cartridges. Some firearms use rim-fire cartridges, and a firing-pin impression appears on the edge of the base of the fired cartridge case rather than on a central primer cup.
Breech designs vary, but a common design is for there to be a breech block, i.e., a block of metal, that halts the backward motion of the cartridge case.
The design of many firearms allows firing pins and breech faces to be replaced, but for simplicity the present paper does not address scenarios involving such changes.
A comparison microscope allows images of two different objects to be juxtaposed and rotated and aligned relative to one another.
In Zhang & Luo [17], 3070 test fires were produced from a total of 5 firearms. In Law et al. [18], 100 test fires were produced from each of 30 firearms.
“Cells” on the known-source image can be of any orientation in any location and do not have to tessellate with each other.
See Ommen & Saunders [54] on the distinction between specific-source and common-source likelihood ratios.
It is not always the case that a firearm that has parallel breech-face marks will clearly transfer those marks to the breech-face region of the cartridge case. If a questioned-source cartridge case has clear parallel marks on its breech-face region, then the class of firearms can be restricted to those with parallel breech-face marks. If the questioned-source cartridge case does not have a clear pattern of marks on its breech-face region, then the class of firearms could be those with parallel breech-face marks, or with circular, cross-hatch, arc, or granular breech-face marks, or with smooth breech faces. In the present research, we have simply used cartridge cases fired from the class of firearms that have parallel breech-face marks without checking whether the cartridges playing the part of questioned-source cartridge cases actually have clear parallel marks. Including this step is something we leave for potential future research. Likewise adopting a broader population for cartridge cases including those without clear patterns of breech-face marks (and collecting data from that broader population) is something we leave for potential future research.
ISO 25178–72:2017/AMD 1:2020 Geometrical product specifications (GPS) — Surface texture: Areal — Part 72: XML file format x3p — Amendment 1.
We also tested central-moment invariants (Hu [68]; Flusser [69]; Flusser & Suk [70]), but they did not perform as well as Zernike moments.
Subject to the previously stated constraints regarding , and , we extracted Zernike moments for all negative, zero, and positive for each up to the maximum order of used.
The calculation of the number of features includes moments for which or , e.g., up to 7th order is features.
In preliminary work, we tested several other dimension-reduction methods, but none outperformed the combination of PCA + LDF.
In Aitken & Lucy [79], it is called the “multivariate normal (MVN) procedure”.
Occasionally, the number of fired cartridge cases available for a firearm was 8 or 9 rather than 10, in which case the number of feature vectors available was used.
In the notation of Morrison & Poh [82]: , where , and is the number of firearms that contributed to scores that were used to train the logistic-regression model. See Morrison & Poh [82] for further explanation.
Use of the term “score” to refer to an uncalibrated log likelihood ratio is common in forensic voice comparison. Such scores, which take account of both similarity and typicality, should not be confused with similarity scores (see §1.4.4).
Natural logarithms were used for the calculations.
In addition, in order to assess the stability of the system using Zernike moment magnitude and phase features extracted from the whole of the region of interest including flowback, we performed randomization tests in which in each iteration we randomly selected a different 198 firearm training dataset versus 99 firearm calibration/validation dataset split. Based on the results, we were satisfied that the system is sufficiently stable with respect to the selection of data for such splits. For brevity, we do not include the results here.
Compare, for example, the score distributions in Figs. 10a, 10b, and 16 of Morrison & Poh [82], and the corresponding Tippett plots in Figs. 11, 12, and 17 of Morrison & Poh [82].
For Model 1 v 1, Model 1 v 3, and Model 1 v 9 respectively, for magnitude-only features the Cllr values were 0.529, 0.387, and 0.357 without rotation, compared to 0.531, 0.390, and 0.359 with rotation, and for magnitude and phase features the Cllr values were 0.520, 0.384, and 0.348 without rotation, compared to 0.519, 0.384, and 0.351 with rotation.
There were a couple of exceptions for circle-moment invariants.
We plan to publish details of these modified procedures elsewhere, along with comparisons of results of segmentation using the original Tai & Eddy [5] procedures and our modified procedures.
References
- 1. Morrison G.S. Advancing a paradigm shift in evaluation of forensic evidence: the rise of forensic data science. Forensic Sci. Int.: Synergy. 2022;4. doi: 10.1016/j.fsisyn.2022.100270.
- 2. Bolton-King R.S. Preventing miscarriages of justice: a review of forensic firearm identification. Sci. Justice. 2016;56:129–142. doi: 10.1016/j.scijus.2015.11.002. [Corrigendum: (2018) 58, 83.]
- 3. Nichols R. Academic Press; London, UK: 2018. Firearm and Tool Mark Identification: the Scientific Reliability of the Forensic Science Discipline.
- 4. Tobin W.A., Blau P. Hypothesis testing of the critical underlying premise of discernible uniqueness in firearms-toolmarks forensic practice. Jurimetrics. 2013;53:121–142. https://ssrn.com/abstract=2185742
- 5. Tai X.H., Eddy W.F. 2020. Automatically Matching Topographical Measurements of Cartridge Cases Using a Record Linkage Framework. http://arxiv.org/abs/2003.00060
- 6. Smith T.P., Smith G.A., Snipes J.B. A validation study of bullet and cartridge case comparisons using samples representative of actual casework. J. Forensic Sci. 2016;61:939–946. doi: 10.1111/1556-4029.13093.
- 7. Mattijssen E.J.A.T., Witteman C.L.M., Berger C.E.H., Brand N.W., Stoel R.D. Validity and reliability of forensic firearm examiners. Forensic Sci. Int. 2020;307. doi: 10.1016/j.forsciint.2019.110112.
- 8. Mattijssen E.J.A.T., Witteman C.L.M., Berger C.E.H., Zheng X.A., Soons J.A., Stoel R.D. Firearm examination: examiner judgments and computer-based comparisons. J. Forensic Sci. 2021;66:96–111. doi: 10.1111/1556-4029.14557.
- 9. Scurich N., Garrett B.L., Thompson R.M. Surveying practicing firearm examiners. Forensic Sci. Int.: Synergy. 2022;4. doi: 10.1016/j.fsisyn.2022.100228.
- 10. Thumwarin P., Prasit C., Boonbumroong P., Matsuura T. Proceedings of the 2008 23rd International Conference Image and Vision Computing New Zealand. 2008. Firearm identification based on FIR system characterizing rotation invariant feature of cartridge case image.
- 11. Liong C.Y., Ghani N.A.M., Kamaruddin S.B.A., Jemain A.A. Firearm classification based on numerical features of the firing pin impression. Procedia Comput. Sci. 2012;13:144–151. doi: 10.1016/j.procs.2012.09.123.
- 12. Ott D., Thompson R., Song J. Applying 3D measurements and computer matching algorithms to two firearm examination proficiency tests. Forensic Sci. Int. 2017;271:98–106. doi: 10.1016/j.forsciint.2016.12.014.
- 13. Addinall K., Zeng W., Bills P., Wilcock P.T., Blunt L. The effect of primer cap material on ballistic toolmark evidence. Forensic Sci. Int. 2019;298:149–156. doi: 10.1016/j.forsciint.2019.02.054.
- 14. Xin L.P., Zhou J., Rong G. Proceedings of the 5th International Conference on Signal Processing/Proceedings of the 16th World Computer Congress. 2000. A cartridge identification system for firearm authentication.
- 15. Legrá A., Marañón E., Pérez H., de la Torre L., Quintana A., Quirós R. Proceedings of the 7th WSEAS International Conference on Applied Computer and Applied Computational Science (ACACOS’08) 2008. Automatic identification of weapons from images of the cartridge case head; pp. 236–241. https://www.researchgate.net/publication/236109928
- 16. Fadul T.G., Jr., Hernández G.A., Stoiloff S., Gulati S. 2012. An Empirical Study to Improve the Scientific Foundation of Forensic Firearm and Tool Mark Identification Utilizing 10 Consecutively Manufactured Slides. Report for National Institute of Justice Award Number 2009-DN-BX-K230. https://www.ojp.gov/pdffiles1/nij/grants/237960.pdf
- 17. Zhang K., Luo Y. Slight variations of breech face marks and firing pin impressions over 3070 consecutive firings evaluated by Evofinder®. Forensic Sci. Int. 2018;283:85–93. doi: 10.1016/j.forsciint.2017.11.035.
- 18. Law E.F., Morris K.B., Jelsema C.M. Determining the number of test fires needed to represent the variability present within 9mm Luger firearms. Forensic Sci. Int. 2017;276:126–133. doi: 10.1016/j.forsciint.2017.04.019.
- 19. Lightstone L. The potential for and persistence of subclass characteristics on the breech faces of SW40VE Smith and Wesson Sigma pistols. Assoc. Firearm Tool mark Exam. J. 2010;42(4):308–322.
- 20. LaPorte D. An empirical and validation study of breechface marks on .380 ACP caliber cartridge cases fired from ten consecutively finished Hi-Point Model C9 pistols. Assoc. Firearm Tool mark Exam. J. 2011;43(4):303–309.
- 21. Song J., Vorburger T.V., Chu W., Yen J., Soons J.A., Ott D.B., Zhang N.F. Estimating error rates for firearm evidence identifications in forensic science. Forensic Sci. Int. 2018;284:15–32. doi: 10.1016/j.forsciint.2017.12.013.
- 22. Zhou J., Xin L.P., Gao D.S., Zhang C.S., Zhang D. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2001. Automated cartridge identification for firearm authentication; pp. I-749–I-754.
- 23. Li D.G. Proceedings of the Sixth International Conference of Information Fusion. 2003. Image processing for the positive identification of forensic ballistics specimens; pp. 1494–1498.
- 24. Gambino C., McLaughlin P., Kuo L., Kammerman F., Shenkin P., Diaczuk P., Petraco N., Hamby J., Petraco N.D.K. Forensic surface metrology: tool mark evidence. Scanning. 2011;33:272–278. doi: 10.1002/sca.20251.
- 25. Petraco N.D.K., Chan H., De Forest P.R., Diaczuk P., Gambino C., Hamby J., Kammerman F.L., Kamrath B.W., Kubic T.A., Kuo L., McLaughlin P., Petillo G., Petraco N., Phelps E.W., Pizzola P.A., Purcell D.K., Shenkin P. 2011. Application of Machine Learning to Toolmarks: Statistically Based Methods for Impression Pattern Comparisons. Report for National Institute of Justice Award Number 2009-DN-BX-K041. https://www.ncjrs.gov/pdffiles1/nij/grants/239048.pdf
- 26. Pan Y., Chen Z., Tong M., Zhao X. Proceedings of the 11th International Conference on Computer Science & Education (ICCSE) 2016. Extraction of individual characteristics of breech face impressions in ballistic identification using optimal Gaussian filter parameters; pp. 519–523.
- 27. Ghani N.A.M., Liong C.Y., Jemain A.A. Analysis of geometric moments as features for firearm identification. Forensic Sci. Int. 2010;198:143–149. doi: 10.1016/j.forsciint.2010.02.011.
- 28. Chuan Z.L., Jemain A.A., Liong C.Y., Ghani N.A.M., Tan L.K. A robust firearm identification algorithm of forensic ballistics specimens. J. Phys. Conf. 2017;890. doi: 10.1088/1742-6596/890/1/012126.
- 29. Leng J., Huang Z. On analysis of circle moments and texture features for cartridge images recognition. Expert Syst. Appl. 2012;39:2092–2101. doi: 10.1016/j.eswa.2011.08.003.
- 30. Fischer R., Vielhauer C. Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec ’14); 2014. Digital crime scene analysis: automatic matching of firing pin impressions on cartridge bottoms using 2D and 3D spatial features; pp. 77–82.
- 31. Fischer R., Vielhauer C. Proceedings of the 3rd ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec ’15); 2015. Automated firearm identification: on using a novel multiple-slice-shape (MSS) approach for comparison and matching of firing pin impression topography; pp. 161–171.
- 32. Morris K.B., Law E.F., Jefferys R.L., Dearth E.C. Interpretation of cartridge case evidence using IBIS and Bayesian networks. 2016. Report on research conducted under Cooperative Agreement Number W911NF-12-2-0056. https://www.ncjrs.gov/App/AbstractDB/AbstractDBDetails.aspx?id=272547
- 33. Li D.G. Proceedings of the 2006 5th IEEE International Conference on Cognitive Informatics. 2006. A new approach for firearm identification with hierarchical neural networks based on cartridge case images; pp. 923–928.
- 34. Ghani N.A.M., Liong C.Y., Jemain A.A. Neurocomputing approach for firearm identification. Pertanika J. Sci. Technol. 2018;26:341–352. http://www.pertanika.upm.edu.my/pjst/browse/regular-issue?article=JST-S0297-2017
- 35. Giudice O., Guarnera L., Paratore A.B., Farinella G.M., Battiato S. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP) 2019. Siamese ballistics neural network; pp. 4045–4049.
- 36. Razak N.A., Liong C.-Y., Jemain A.A., Ghani N.A.M., Zakaria S. Automatic firing pin impression identification based on feature fusion of fractal dimension and geometric moment. J. Telecommun. Electron. Comput. Eng. 2020;12(2):7–10. https://jtec.utem.edu.my/jtec/article/view/5823
- 37. Roth J., Carriveau A., Liu X., Jain A.K. Proceedings of the 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS) 2015. Learning-based ballistic breech face impression image matching.
- 38. Song J. Proposed “congruent matching cells (CMC)” method for ballistic identification and basic concepts valid and invalid correlation region. Assoc. Firearm Tool mark Exam. J. 2015;47(3):177–185. https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=911193
- 39. Tai X.H., Eddy W.F. A fully automatic method for comparing cartridge case images. J. Forensic Sci. 2018;63:440–448. doi: 10.1111/1556-4029.13577.
- 40. Zhang N.F. The use of correlated binomial distribution in estimating error rates for firearm evidence identification. J. Res. Natl. Inst. Stand. Technol. 2019;124. doi: 10.6028/jres.124.026.
- 41. Song J., Vorburger T.V., Ma L., Libert J.M., Ballou S.M. Proceedings of the American Society for Precision Engineering (ASPE). 2005. A metric for the comparison of surface topographies of standard reference material (SRM) bullets and casings. https://www.nist.gov/publications/metric-comparison-surface-topographies-standard-reference-material-srm-bullets-and
- 42. Zhang H., Song J., Tong M., Chu W. Correlation of firing pin impressions based on congruent matching cross-sections (CMX) method. Forensic Sci. Int. 2016;263:186–193. doi: 10.1016/j.forsciint.2016.04.015.
- 43. Zhang H., Zhu J., Hong R., Wang H., Sun F., Malik A. Convergence-improved congruent matching cells (CMC) method for firing pin impression comparison. J. Forensic Sci. 2021;66:571–582. doi: 10.1111/1556-4029.14634.
- 44. Chen Z., Song J., Chu W., Soons J.A., Zhao X. A convergence algorithm for correlation of breech face images based on the congruent matching cells (CMC) method. Forensic Sci. Int. 2017;280:213–223. doi: 10.1016/j.forsciint.2017.08.033.
- 45.Tong M., Pan Y., Li Z., Lin W. Valid data based normalized cross-correlation (VDNCC) for topography identification. Neurocomputing. 2018;308:184–193. doi: 10.1016/j.neucom.2018.04.059. [DOI] [Google Scholar]
- 46.Tong M., Yu X., Huang S. Automatic identification of firing pin impressions based on the Congruent Matching Cell (CMC) method. Neurocomputing. 2019;367:246–258. doi: 10.1016/j.neucom.2019.08.033. [DOI] [Google Scholar]
- 47.Riva F., Champod C. Automatic comparison and evaluation of impressions left by a firearm on fired cartridge cases. J. Forensic Sci. 2014;59:637–647. doi: 10.1111/1556-4029.12382. [DOI] [PubMed] [Google Scholar]
- 48. Dong F., Zhao Y., Luo Y., Zhang W., Zhang K. Specificity of characteristic marks on cartridge cases from 3070 consecutive firings of a Chinese Norinco QSZ-92 9 mm Pistol. J. Forensic Sci. Med. 2019;5(2):87–94. doi: 10.4103/jfsm.jfsm_6_19.
- 49. Riva F., Mattijssen E.J.A.T., Hermsen R., Pieper P., Kerkhoff W., Champod C. Comparison and interpretation of impressed marks left by a firearm on cartridge cases – towards an operational implementation of a likelihood ratio based technique. Forensic Sci. Int. 2020;313. doi: 10.1016/j.forsciint.2020.110363.
- 50. Song J., Chen Z., Vorburger T.V., Soons J.A. Evaluating likelihood ratio (LR) for firearm evidence identifications in forensic science based on the Congruent Matching Cells (CMC) method. Forensic Sci. Int. 2020;317. doi: 10.1016/j.forsciint.2020.110502.
- 51. Morrison G.S., Enzinger E. Score based procedures for the calculation of forensic likelihood ratios – scores should take account of both similarity and typicality. Sci. Justice. 2018;58:47–58. doi: 10.1016/j.scijus.2017.06.005.
- 52. Neumann C., Ausdemore M. Defence against the modern arts: the curse of statistics – part II: ‘Score-based likelihood ratios’. Law Probab. Risk. 2020;19:21–42. doi: 10.1093/lpr/mgaa006.
- 53. Neumann C., Hendricks J., Ausdemore M. In: Handbook of Forensic Statistics. Banks D., Kafadar K., Kaye D.H., Tackett M., editors. CRC; Boca Raton, FL: 2020. Statistical support for conclusions in fingerprint examinations; pp. 277–324.
- 54. Ommen D.M., Saunders C.P. A problem in forensic science highlighting the differences between the Bayes factor and likelihood ratio. Stat. Sci. 2021;36:344–359. doi: 10.1214/20-STS805.
- 55. Wang Y. Class characteristic classification of test fired cartridge cases: a digital image decision tree approach to Kensington’s matrix for initial stages of criminal investigation. J. Forensic Sci. Crim. Invest. 2017;6. doi: 10.19080/JFSCI.2017.06.555693.
- 56. Morrison G.S., Enzinger E., Ramos D., González-Rodríguez J., Lozano-Díez A. In: Handbook of Forensic Statistics. Banks D., Kafadar K., Kaye D.H., Tackett M., editors. CRC; Boca Raton, FL: 2020. Statistical models in forensic voice comparison; pp. 451–497.
- 57. Morrison G.S., Weber P., Enzinger E., Labrador B., Lozano-Díez A., Ramos D., González-Rodríguez J. In: Encyclopedia of Forensic Sciences. third ed. Houck M., Wilson L., Lewis S., Eldridge H., Reedy P., Lothridge K., editors. Elsevier; 2022. Forensic voice comparison – human-supervised-automatic approach. In press, available at: https://www.elsevier.com/books/encyclopedia-of-forensic-sciences/houck/978-0-12-823677-2 A preprint is available at: http://forensic-voice-comparison.net/encyclopedia/
- 58. Weber P., Enzinger E., Labrador B., Lozano-Díez A., Ramos D., González-Rodríguez J., Morrison G.S. Validation of the alpha version of the E3 Forensic Speech Science System (E3FS3) core software tools. Forensic Sci. Int.: Synergy. 2022;4. doi: 10.1016/j.fsisyn.2022.100223.
- 59. Zernike F. Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode. Physica. 1934;1:689–704. doi: 10.1016/S0031-8914(34)80259-5.
- 60. Teague M.R. Image analysis via the general theory of moments. J. Opt. Soc. Am. 1980;70:920–930. doi: 10.1364/JOSA.70.000920.
- 61. Khotanzad A., Hong Y.H. Invariant image recognition by Zernike moments. IEEE Trans. Pattern Anal. Mach. Intell. 1990;12(5):489–497. doi: 10.1109/34.55109.
- 62. Iskander D.R., Collins M.J., Davis B. Optimal modeling of corneal surfaces with Zernike polynomials. IEEE Trans. Biomed. Eng. 2001;48:87–95. doi: 10.1109/10.900255.
- 63. Sun M., Birkenfeld J., de Castro A., Ortiz S., Marcos S. OCT 3-D surface topography of isolated human crystalline lenses. Biomed. Opt. Express. 2014;5:3547–3561. doi: 10.1364/BOE.5.003547.
- 64. Pinhasi S.V., Alimi R., Eliezer S., Perelmutter L. Fast optical computerized topography. Phys. Lett. A. 2010;374:2798–2800. doi: 10.1016/j.physleta.2010.04.05.
- 65. Vretos N., Nikolaidis N., Pitas I. Proceedings of the 18th IEEE International Conference on Image Processing. 2011. 3D facial expression recognition using Zernike moments on depth images; pp. 773–776.
- 66. Teh C., Chin R.T. On image analysis by the methods of moments. IEEE Trans. Pattern Anal. Mach. Intell. 1988;10:496–513. doi: 10.1109/34.3913.
- 67. Belkasim S.O., Shridhar M., Ahmadi M. Pattern recognition with moment invariants: a comparative study and new results. Pattern Recogn. 1991;24:1117–1138. doi: 10.1016/0031-3203(91)90140-Z.
- 68. Hu M.-K. Visual pattern recognition by moment invariants. IEEE Trans. Inf. Theor. 1962;8(2):179–187. doi: 10.1109/TIT.1962.1057692.
- 69. Flusser J. On the independence of rotation moment invariants. Pattern Recogn. 2000;33:1405–1410. doi: 10.1016/S0031-3203(99)00127-2.
- 70. Flusser J., Suk T. Rotation moment invariants for recognition of symmetric objects. IEEE Trans. Image Process. 2006;15:3784–3790. doi: 10.1109/TIP.2006.884913.
- 71. Pearson K. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Phil. Mag. J. Sci. 1901;2:559–572. doi: 10.1080/14786440109462720.
- 72. Hotelling H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933;24(6):417–441. doi: 10.1037/h0071325.
- 73. Fisher R.A. The use of multiple measurements in taxonomic problems. Ann. Eug. 1936;7:179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x.
- 74. Rao C.R. The utilization of multiple measurements in problems of biological classification. J. Roy. Stat. Soc. B. 1948;10:159–203. http://www.jstor.org/stable/2983775
- 75. Prince S.J.D., Elder J.H. Proceedings of the IEEE 11th International Conference on Computer Vision. 2007. Probabilistic linear discriminant analysis for inferences about identity; pp. 1–8.
- 76. Kenny P. Proceedings of Odyssey 2010: the Speaker and Language Recognition Workshop. 2010. Bayesian speaker verification with heavy tailed priors; paper 014. https://www.isca-speech.org/archive_open/odyssey_2010/od10_014.html
- 77. Brümmer N., de Villiers E. Proceedings of Odyssey 2010: the Speaker and Language Recognition Workshop. 2010. The speaker partitioning problem; pp. 194–201. https://www.isca-speech.org/archive_open/odyssey_2010/od10_034.html
- 78. Sizov A., Lee K.A., Kinnunen T. In: Structural, Syntactic, and Statistical Pattern Recognition. Fränti P., Brown G., Loog M., Escolano F., Pelillo M., editors. Springer; Berlin: 2014. Unifying probabilistic linear discriminant analysis variants in biometric authentication; pp. 464–475.
- 79. Aitken C.G.G., Lucy D. Evaluation of trace evidence in the form of multivariate data. Appl. Stat. 2004;53:109–122. doi: 10.1046/j.0035-9254.2003.05271.x. [Corrigendum: (2004) 53, 665–666.]
- 80. González-Rodríguez J., Rose P., Ramos D., Toledano D.T., Ortega-García J. Emulating DNA: rigorous quantification of evidential weight in transparent and testable forensic speaker recognition. IEEE Trans. Speech Audio Process. 2007;15:2104–2115. doi: 10.1109/TASL.2007.902747.
- 81. Morrison G.S. Tutorial on logistic-regression calibration and fusion: converting a score to a likelihood ratio. Aust. J. Forensic Sci. 2013;45:173–197. doi: 10.1080/00450618.2012.733025.
- 82. Morrison G.S., Poh N. Avoiding overstating the strength of forensic evidence: shrunk likelihood ratios/Bayes factors. Sci. Justice. 2018;58:200–218. doi: 10.1016/j.scijus.2017.12.005.
- 83. Hestenes M.R., Stiefel E. Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 1952;49:409–436. doi: 10.6028/jres.049.044.
- 84. Minka T.P. A comparison of numerical optimizers for logistic regression. 2003. https://tminka.github.io/papers/logreg/ Technical report.
- 85. Morrison G.S., Enzinger E., Hughes V., Jessen M., Meuwly D., Neumann C., Planting S., Thompson W.C., van der Vloed D., Ypma R.J.F., Zhang C., Anonymous A., Anonymous B. Consensus on validation of forensic voice comparison. Sci. Justice. 2021;61:229–309. doi: 10.1016/j.scijus.2021.02.002.
- 86. Brümmer N., du Preez J. Application independent evaluation of speaker detection. Comput. Speech Lang. 2006;20:230–275. doi: 10.1016/j.csl.2005.08.001.
- 87. Meuwly D. Doctoral dissertation, University of Lausanne; 2001. Reconnaissance de locuteurs en sciences forensiques: l’apport d’une approche automatique. https://www.unil.ch/files/live/sites/esc/files/shared/These.Meuwly.pdf









