2022 Oct 7;137(2):471–485. doi: 10.1007/s00414-022-02899-7

Machine learning and discriminant function analysis in the formulation of generic models for sex prediction using patella measurements

Mubarak A Bidmos 1, Oladiran I Olateju 2, Sabiha Latiff 2, Tawsifur Rahman 3, Muhammad E H Chowdhury 3
PMCID: PMC9902304  PMID: 36205796

Abstract

Sex prediction from bone measurements that display sexual dimorphism is one of the most important aspects of forensic anthropology. Some bones like the skull and pelvis display distinct morphological traits that are based on shape. These morphological traits, which are sexually dimorphic across different population groups, have been shown to provide an acceptably high degree of accuracy in the prediction of sex. A sample of 100 patella of Mixed Ancestry South Africans (MASA) was collected from the Dart collection. Six parameters were used in this study: maximum height (maxh), maximum breadth (maxb), maximum thickness (maxt), the height of articular facet (haf), lateral articular facet breadth (lafb), and medial articular facet breadth (mafb). Stepwise and direct discriminant function analyses were performed for measurements that exhibited significant differences between male and female mean measurements, and the “leave-one-out” approach was used for validation. In addition, eight classical machine learning techniques were used along with feature ranking techniques to identify the best feature combinations for sex prediction. A stacking machine learning technique was trained and validated to classify the sex of the subject: the three top-performing ML classifiers served as base learners, and their predictions were used as inputs to different machine learning classifiers acting as meta learners, which made the final decision. The measurements of the patella of South Africans are sexually dimorphic, and this observation is consistent with previous studies on the patella in different countries. The average accuracies obtained for pooled multivariate discriminant function equations range from 81.9 to 84.2%, while the stacking ML technique provides 90.8% accuracy, which compares well with those reported in previous studies from other parts of the world. In conclusion, the models proposed in this study from measurements of the patella of different population groups in South Africa are useful and present with reasonably high average accuracies.

Keywords: Forensic anthropology, Sex prediction, Patella, Discriminant function analyses, Machine learning

Introduction

Prediction of sex from recovered or discovered bones in human identification is an important first step taken by forensic anthropologists to reduce the number of possible matches by 50% [1]. This process, in conjunction with the estimation of age, stature and population affinity, is essential in establishing the identity of an individual from skeletal remains. Some bones like the skull and pelvis display distinct morphological traits that are based on shape. These morphological traits, which are sexually dimorphic across different population groups, have been shown to provide an acceptably high degree of accuracy in the estimation of sex [2]. Most earlier researchers, and some recent ones [3], have focused on describing the observed morphological traits, a subjective method that requires many years of experience. Attention has recently shifted to quantifying the differences in shape that are observed on bones. These quantifications can be performed objectively using various morphometric techniques including, but not limited to, geometric morphometrics [4–11].

In the absence of the pelvis and the skull, which display obvious morphological differences between males and females, size differences that are present in most bones of the postcranial skeleton can also be used for sex prediction. This metrical approach can also be used on incomplete or fragmentary remains. Standard, easily reproducible measurements of different bones of the skeleton have been analyzed using various statistical methods, including but not limited to logistic regression and discriminant function analysis, in different population groups. It is well established that the resulting equations are population-specific and should therefore be applied only to the population groups for which they were formulated in order to obtain acceptably high average classification accuracies. This has led to the generation of population-specific discriminant function and logistic regression equations for measurements of the skull [12–17], bones of the vertebral column [18, 19], pelvis [20–22], long bones of the upper [23–29] and lower extremities [30–36], and hand and foot bones [37–40] in different population groups of the world with acceptably high classification rates.

Similar population-specific local standards have also been established in South Africa for the prediction of sex from dimensions of the skull [37, 38] and postcranial bones [39–46]. Most of these equations have been derived from data collected from samples of bones of South Africans of European and African descent, which are housed mainly in the Raymond Dart collection of human skeletons [47], the Pretoria bone collection [48], and the UCT osteological collection [49]. Recently, successful attempts have also been made to formulate population-specific equations for Mixed-Ancestry South Africans or colored [43, 50–52]. While discriminant function analysis has been widely used in sex estimation using various bones of the human skeleton in South Africa, no previous attempts have utilized other novel techniques such as machine learning algorithms for that purpose.

Over the past decade, machine learning (ML) algorithms have become increasingly integrated into clinical predictive modeling, e.g., in prognostic models using health data [53–55]. Recent reviews have also highlighted the high interest in ML approaches for clinical guidance, as well as the necessity for more prognostic studies [56]. While interest in ML in health care has risen significantly, only a few studies have evaluated whether it outperforms conventional statistical models (CSMs) in predictive performance. ML rapidly examines continuously expanding datasets and enables the identification of patterns and trends that may not be directly visible to clinicians [57]. Other advantages of ML are that it is flexible and nonparametric, requires no data model for the probability distribution of the outcome variable, requires no pre‐specification of covariates, and can process a large number of input variables simultaneously [58, 59]. In a clinical context of predicting mortality from gastrointestinal bleeding, a systematic review demonstrated higher c‐indices and predictive capacity of ML than clinical risk scores [60]. Another study aimed at predicting bleeding risk following percutaneous coronary intervention found that ML characterized bleeding risk better than a standard discriminant analysis model [61]. Likewise, a comparison of ML and CSMs using the TOPCAT trial dataset showed that ML methods presented higher c‐indices than CSMs for predicting readmission (0.76 vs 0.73) and mortality (0.72 vs 0.66) [62].

Osteometric variations between population groups have necessitated population-specific standards for human identification. In addition, each of these groups expresses sexual dimorphism to a different degree. Some authors have observed major flaws in the development and application of population-specific standards for the prediction of sex [63] and stature [64, 65], leading to the proposal and recommendation of generic equations. This study thus aims to (1) formulate generic models for sex prediction using measurements of the patella of South Africans of African (SAAD) and European descent (SAED), as well as Mixed Ancestry South Africans (MASA), and (2) compare the classification rates obtained from the generic models using linear discriminant analysis with those obtained from machine learning algorithms.

Materials and methods

The Human Research Ethics Committee (Medical) of the University of the Witwatersrand, Johannesburg, South Africa granted an ethical clearance waiver (Ethics Waiver Number: W-CJ-140604–1) before the commencement of this study. Data were collected from a sample of patella of Mixed Ancestry South Africans (MASA). Additional data analyzed in the current study were obtained from previously published studies on sexual dimorphism of measurements of patella of South Africans of European descent (SAED) [46] and South Africans of African descent (SAAD) [66]. The sample distribution for the data used in the current study is as follows: SAAD (50 males and 50 females), SAED (50 males and 50 females), and MASA (30 males and 30 females). The birth dates range from 1999 to 2017. The source of data was the Raymond A. Dart collection of human skeletons, considered one of the largest collections of human skeletons in the world [47]. It is located in the School of Anatomical Sciences of the University of the Witwatersrand, Johannesburg, South Africa. The patella belonged to individuals whose age at death ranged between 25 and 79 years and whose birth years were between 1928 and 1991. Patella with any pathological features such as osteophytic lipping, lesions, or any other obvious deformities were excluded from this study. Six parameters were measured on each patella: maximum height (maxh), maximum breadth (maxb), maximum thickness (maxt), the height of articular facet (haf), lateral articular facet breadth (lafb), and medial articular facet breadth (mafb). These measurements have been described in previous studies [46] and are illustrated in Fig. 1. Lin’s [67] concordance correlation coefficient of reproducibility was used for the assessment of intraobserver error. It has been shown that this method assesses the agreement between the test and retest measurements and is considered a measure of precision of the measuring technique.

Fig. 1.

Illustration of measurements of the patella

Statistical analysis

Statistical analysis was performed using the Stata/MP 13.0 software. The SPSS version 23 software program was used for the linear discriminant analysis of data. Sex differences were described using numbers and percentages. The number of missing data, mean, standard deviation, median, and quartiles (Q1, Q3) for combined data for all measurements from SAED, SAAD, and MASA were calculated separately for each sex. In univariate analysis, rank-sum tests were performed for all variables. A P value < 0.05 was considered statistically significant.

Discriminant function analysis

Stepwise and direct discriminant function analyses were performed for measurements that exhibited significant sex differences. The “leave-one-out” classification procedure was then used to evaluate the validity of the functions. In this procedure, each case in the sample is classified using the function that is generated without it. Then, generic stepwise and direct discriminant functions with acceptably high average accuracies were selected. Each of the generic functions selected was used to predict sex for each case in samples of SAED, SAAD, and MASA. The average accuracies in correct sex prediction for each of the functions were calculated for each population group separately.
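The leave-one-out procedure described above can be sketched in a few lines. The example below is a simplified illustration rather than the SPSS procedure used in the study: it handles a single measurement, assumes equal priors and a common variance (so that the sectioning point is the midpoint between the two group means), and uses hypothetical values. The function name `loo_accuracy` is introduced here for illustration only.

```python
from statistics import mean

def loo_accuracy(values, labels):
    """Leave-one-out accuracy of a univariate two-group discriminant rule.

    Assumption: equal priors and a common variance, so the sectioning point
    is the midpoint between the male and female training means.
    """
    correct = 0
    for i, (x, y) in enumerate(zip(values, labels)):
        # Refit the rule on every case except the one being classified.
        train = [(v, l) for j, (v, l) in enumerate(zip(values, labels)) if j != i]
        male_mean = mean(v for v, l in train if l == "M")
        female_mean = mean(v for v, l in train if l == "F")
        cut = (male_mean + female_mean) / 2
        # The group with the larger mean lies above the sectioning point.
        upper, lower = ("M", "F") if male_mean > female_mean else ("F", "M")
        predicted = upper if x > cut else lower
        correct += predicted == y
    return correct / len(values)

# Hypothetical maxh values in mm (illustrative only, not the study data).
maxh = [42.1, 44.0, 40.5, 43.2, 38.4, 37.0, 36.1, 38.9, 35.5, 40.9]
sex = ["M", "M", "M", "M", "M", "F", "F", "F", "F", "F"]
accuracy = loo_accuracy(maxh, sex)
print(f"Leave-one-out accuracy: {accuracy:.1%}")
```

Because each case is classified by a rule that never saw it, the resulting accuracy is a less optimistic estimate than the resubstitution rate obtained on the full sample.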

Machine learning-based analysis

The six patella measurements in the dataset were evaluated to determine the Pearson correlation among them. Figure 2 shows the correlation heatmap; no pair of measurements was highly correlated. The maximum correlation, 0.81, was found between maxb and lafb. Because the threshold for removing highly correlated features was set at r > 0.85, no feature was removed for the next phase of the investigation.
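A correlation screen of this kind can be sketched as follows. The data are hypothetical and deliberately contain one strongly correlated pair so that the filter has something to flag; in the study itself no pair exceeded the r > 0.85 threshold. The helper names `pearson` and `highly_correlated_pairs` are introduced here for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def highly_correlated_pairs(features, threshold=0.85):
    """Return (name_a, name_b, r) for every pair with r above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(features.items(), 2):
        r = pearson(a, b)
        if r > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical measurements in mm (illustrative only, not the study data).
features = {
    "maxh": [42.0, 37.5, 44.1, 36.2, 40.3, 38.8],
    "maxb": [43.5, 39.0, 45.9, 38.1, 41.0, 40.2],
    "maxt": [20.1, 19.4, 19.0, 18.2, 21.0, 17.9],
}
flagged = highly_correlated_pairs(features)
print(flagged)
```

One member of each flagged pair would normally be dropped before model training; an empty list, as in the study, means all features are retained.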

Fig. 2.

Heatmap of Pearson correlation among different features

Data normalization

The accuracy of machine learning models depends on the quality of the input data for achieving generalized performance. This involves data normalization, which entails scaling or transforming the data so that each feature contributes equally during the training process. The performance gains from such normalization have been verified by many studies [68]. In this study, Z-score normalization was utilized due to its sensitivity to outliers. The formula for Z-score normalization is shown in Eq. (1):

$$v' = \frac{v - \mu_v}{\sigma_v} \tag{1}$$

where $v'$, $v$, $\mu_v$, and $\sigma_v$ denote the new value, original value, mean, and standard deviation of the variable values in the training samples, respectively. This method transforms the data to have a mean of 0 and a standard deviation of 1.
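A minimal sketch of Eq. (1) is given below. The key point is that the mean and standard deviation are estimated on the training samples only and then reused for unseen data. Whether the study used the population or the sample standard deviation is not stated; the population form is assumed here, and the values are hypothetical.

```python
from statistics import mean, pstdev

def fit_zscore(train_values):
    """Estimate the mean and standard deviation on the training samples only."""
    # Assumption: population standard deviation (divisor n).
    return mean(train_values), pstdev(train_values)

def apply_zscore(values, mu, sigma):
    """Apply Eq. (1) using previously fitted training statistics."""
    return [(v - mu) / sigma for v in values]

# Hypothetical maxh values in mm (illustrative only, not the study data).
train = [40.0, 42.0, 44.0, 38.0, 36.0]
test = [41.0, 35.0]

mu, sigma = fit_zscore(train)
z_train = apply_zscore(train, mu, sigma)
z_test = apply_zscore(test, mu, sigma)  # reuses the training mean and SD
print(z_train)
print(z_test)
```

Fitting the statistics on the training fold alone keeps information from the test fold out of the preprocessing step.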

Top-ranked features identification

The feature selection technique automatically selects those features which are most significant for output prediction. This method thus helps in reducing overfitting and training time as well as improving accuracy. Several different feature selection techniques, e.g., univariate selection, recursive feature elimination (RFE), principal component analysis (PCA), bagged decision trees like random forest and extra trees, and boosted trees like Extreme Gradient Boosting (XGBoost) etc. have been used in the literature. However, the present study investigated and compared three feature selection techniques: (1) XGBoost [69], (2) Extra tree [70], and (3) Random Forest [71, 72] to determine the best feature combinations for sex prediction using different ML classifiers.
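Importance-based ranking with tree ensembles can be sketched with scikit-learn as shown below. The data are synthetic, with only the first column carrying signal, so the expected ranking is known in advance; the hyperparameters are assumptions, not those of the study. XGBoost is omitted because it requires a separate library.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

rng = np.random.default_rng(0)
names = ["maxh", "maxb", "maxt", "haf", "mafb", "lafb"]

# Synthetic data (not the study data): only the first column carries signal.
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, len(names)))
X[:, 0] += 4.0 * y  # shift the "maxh" column for one class

def rank_features(model, X, y, names):
    """Fit a tree ensemble and return feature names sorted by importance."""
    model.fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    return [names[i] for i in order]

rf_rank = rank_features(
    RandomForestClassifier(n_estimators=200, random_state=0), X, y, names
)
et_rank = rank_features(
    ExtraTreesClassifier(n_estimators=200, random_state=0), X, y, names
)
print("Random forest:", rf_rank)
print("Extra trees:  ", et_rank)
```

The Top-k feature sets evaluated later correspond to the first k names of such a ranking.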

Model development

The present study used and compared different machine learning classifiers, namely Gradient boosting [69], XGBoost [73], Extra tree [73], K-nearest neighbour (KNN) [73], Adaboost [73], Random Forest [73], linear discriminant analysis (LDA) [71, 72], and Logistic regression [74], using the best feature combination identified by the feature selection techniques for sex prediction. We then investigated a stacking approach in which a combination of base learners and meta learners was used to classify the sex of the subject. The three top-performing ML classifiers served as base learners, and their predictions were used as inputs to different machine learning classifiers acting as meta learners, which made the final decision. Eight different machine learning classifiers were investigated as meta learners to find the best performing classifier for the stacking approach.

Consider a single dataset $A$ that consists of feature vectors $x_i$ and their classification probability scores $y_i$. First, a set of base-level classifiers $M_1, \ldots, M_p$ is generated, and their outputs are used to train the meta-level classifier as illustrated in Fig. 3.

Fig. 3.

Stacking model architecture

Five-fold cross-validation was used to create a training set for the meta-level classifier. Base-level classifiers were trained on four folds, leaving one fold for validation. Each base-level classifier predicts a probability distribution over the possible class values. Thus, for an input $x$, a probability distribution is created from the predictions of each classifier $M$ in the base-level classifier set:

$$P^{M}(x) = \left(P^{M}(c_1 \mid x),\, P^{M}(c_2 \mid x),\, \ldots,\, P^{M}(c_n \mid x)\right) \tag{2}$$

where $(c_1, c_2, \ldots, c_n)$ is the set of possible class values and $P^{M}(c_i \mid x)$ denotes the probability that example $x$ belongs to class $c_i$ as estimated by classifier $M$ in Eq. (2). The class $c_i$ with the highest class probability $P^{M_j}(c_i \mid x)$ is predicted by classifier $M_j$. The attributes of the meta-level classifier $M_f$ are thus the probabilities predicted for each possible class by each of the base-level classifiers, i.e., $P^{M_j}(c_i \mid x)$ for $i = 1, \ldots, n$ and $j = 1, \ldots, p$. The pseudo-code for the stacking approach is shown in Algorithm 1.

Algorithm 1.

Stacking classification
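The stacking procedure described above can be sketched with scikit-learn's `StackingClassifier`. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the hyperparameters are arbitrary, and the choice of random forest and extra trees as base learners with gradient boosting as meta learner mirrors the configuration reported in the Results.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in for three standardized measurements (not the study data).
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 3)) + 1.5 * y[:, None]

base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,                          # out-of-fold predictions train the meta learner
    stack_method="predict_proba",  # class probabilities are the meta-level attributes
)

# Outer five-fold cross-validation: every sample is predicted exactly once.
pred = cross_val_predict(stack, X, y, cv=5)
accuracy = float((pred == y).mean())
print(f"Cross-validated accuracy: {accuracy:.3f}")
```

The inner `cv=5` ensures the meta learner is trained on predictions the base learners made for samples they had not seen, which is the role of the held-out fold in the description above.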

Performance metrics

Different classification models were compared using the top-ranked features from the testing data to calculate the performance metrics in classifying male and female classes. The best performing classifier was evaluated for different combinations of features as input to the model by calculating the receiver operating characteristic (ROC) area under the curve (AUC) and performance metrics such as accuracy, precision, sensitivity, specificity, and F1-score as shown in Eqs. (3–7). Different classification algorithms and different feature combinations of the best performing algorithm were validated using fivefold cross-validation, where training and testing were done on 80% and 20% of the data, respectively, and this process was repeated five times to test the entire dataset. Weighted averages with 95% confidence intervals were calculated for sensitivity, specificity, precision, F1-score, and overall accuracy from the confusion matrix that accumulates all test (unseen) fold results of the fivefold cross-validation. The correct estimation of a male subject is a true positive (TP), and the correct estimation of a female subject is a true negative (TN). The incorrect estimation of a male subject as female is a false negative (FN), and the incorrect estimation of a female subject as male is a false positive (FP).

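$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{5}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{6}$$

$$\text{F1-score} = 2\,\frac{\text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{7}$$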
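As a worked check of Eqs. (3) to (7), the sketch below computes the five metrics from confusion-matrix counts, with male as the positive class. The counts are those reported in the Results for the stacking model (120 of 130 males and 116 of 130 females classified correctly); the function name `classification_metrics` is introduced here for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return the metrics of Eqs. (3)-(7), with male as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Stacking model counts from the Results: 10 males were classified as
# female (FN) and 14 females were classified as male (FP).
metrics = classification_metrics(tp=120, tn=116, fp=14, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

These counts give 90.77% accuracy, 89.55% precision, 92.31% sensitivity, 89.23% specificity, and an F1-score of 90.91%, in line with the stacking results reported in Table 6.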

Results

The values of Lin’s concordance correlation coefficient of reproducibility range between 0.974 and 0.998 (Table 1). These values are consistent with the recommended range of 0.90 to 0.99, which indicates that all patella measurements are easily reproducible and that the data analyzed in this study are not significantly affected by measurement error. For clarity, the analyses on discriminant function and machine learning are presented separately. The first section presents results from descriptive statistics and from univariate and multivariate discriminant function analysis, while the second section reports the best feature selection, the validation of the ML models, and the stacking technique.

Table 1.

Table of concordance correlation coefficients of reproducibility (Pc)

Measurement Pc
MAXT 0.994
MAXB 0.998
MAXH 0.997
LAFB 0.994
HAF 0.974
MAFB 0.998

Discriminant function analysis

The descriptive statistics of all measurements for pooled data are shown in Table 2. Males showed significantly higher (p ≤ 0.05) mean values than females for all measurements. All patella measurements were subjected to stepwise and direct discriminant function analyses. The unstandardized coefficients, constants, average accuracies, cross-validated accuracies in correct sex classification, and the sectioning points for individual measurements are shown in Table 3. The best performing variable, maxh, presented with an acceptably high average accuracy of 81.5%, while the other variables presented with lower average accuracies, which ranged between 69.2% for lafb and 78.5% for maxb (Table 3).

Table 2.

Characteristics of the study subjects and univariate analysis

Item | Male | Female | Total | Method | Statistic | P value
maxh | | | | Rank-sum test | Z = −13.11 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 42.07 ± 3.14 | 37.25 ± 2.8 | 39.66 ± 3.8
• Median | 42.15 | 37 | 39
• Q1, Q3 | 40, 44 | 35.2, 38.9 | 36.8, 42.5
• Min, max | 32.49, 51 | 31, 52 | 31, 52
maxb | | | | Rank-sum test | Z = −18.26 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 43.77 ± 3.28 | 38.85 ± 3.23 | 41.3 ± 4.1
• Median | 44 | 38.9 | 41
• Q1, Q3 | 41.5, 46 | 36.82, 40.7 | 38.13, 44.2
• Min, max | 34.84, 52 | 31.73, 51 | 31.73, 52
maxt | | | | Rank-sum test | Z = −9.6 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 20.38 ± 1.62 | 18.13 ± 1.75 | 19.26 ± 2.02
• Median | 20.35 | 18 | 19.17
• Q1, Q3 | 19.15, 21.5 | 17, 19.5 | 18, 20.73
• Min, max | 15.7, 25.4 | 14.5, 24 | 14.5, 25.4
haf | | | | Rank-sum test | Z = −11.08 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 30.26 ± 2.65 | 27.42 ± 2.8 | 28.85 ± 3.06
• Median | 30 | 27 | 29
• Q1, Q3 | 28.72, 32 | 25.7, 29 | 26.5, 30.77
• Min, max | 23.3, 37 | 22, 39 | 22, 39
mafb | | | | Rank-sum test | Z = −14.77 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 19.7 ± 2.02 | 17.44 ± 2.23 | 18.57 ± 2.4
• Median | 20 | 17.3 | 18.2
• Q1, Q3 | 18.05, 21 | 16, 18.47 | 17, 20.4
• Min, max | 14.9, 24 | 12.3, 24.92 | 12.3, 24.92
lafb | | | | Rank-sum test | Z = −11.34 | < 0.05
• N (missing) | 130 (0) | 130 (0) | 260 (0)
• Mean ± SD | 26.34 ± 2.7 | 23.4 ± 2.6 | 24.87 ± 3.02
• Median | 26 | 23 | 25
• Q1, Q3 | 24.6, 28 | 21.71, 25 | 22.57, 26.98
• Min, max | 20.8, 38 | 16.92, 32 | 16.92, 38
Outcome (%) | 130 (50%) | 130 (50%) | 260

Table 3.

Univariate discriminant function analysis

Variable Unstandardized coefficient Constant Average (O) Average (C)
MAXH 0.336  − 13.320 81.5 (M = 79.2, F = 83.8) 81.5 (M = 79.2, F = 83.8)
MAXB 0.307  − 12.675 78.5 (M = 76.9, F = 80.0) 78.1 (M = 76.9, F = 79.2)
MAXT 0.593  − 11.420 74.6 (M = 73.1, F = 76.2) 74.6 (M = 73.1, F = 76.2)
MAFB 0.471  − 8.749 72.7 (M = 70.0, F = 75.4) 72.7 (M = 70.0, F = 75.4)
HAF 0.368  − 10.602 71.9 (M = 73.8, F = 70.0) 71.9 (M = 73.8, F = 70.0)
LAFB 0.377  − 9.374 69.2 (M = 71.5, F = 66.9) 69.2 (M = 71.5, F = 66.9)

Discriminant function equation (y) = unstandardized coefficient × variable + constant

Sectioning point is 0. O, original classification rate before cross-validation; C, classification rate after cross-validation

Table 4 shows the stepwise and direct discriminant function analysis using various combinations of measurements. In the stepwise analysis, four measurements were selected, namely, maxh, maxb, maxt, and haf. The discriminant function equation derived from these measurements provided an average accuracy of 84.2% as shown in Table 4. The other functions in Table 4 were formulated using direct discriminant function analysis of patella measurements. The average accuracies in correct sex classification ranged between 81.9 (Function D5, Table 4) and 83.5% (Function D1, Table 4). The results of the cross-validation using the leave-one-out classification showed that the average accuracy in correct sex classification for most of the presented functions remained unchanged (Table 4). Functions D2 and D4 showed a minimal and insignificant drop in classification rate of 0.8% thereby confirming the validity of the derived functions from the pooled data.

Table 4.

Multivariate discriminant function analysis

Variables Unstandardized coefficient Average accuracies (%)
O C
Stepwise maxh 0.163 84.2 (M = 83.8, F = 84.6) 84.2 (M = 83.8, F = 84.6)
maxb 0.09
maxt 0.144
haf 0.099
Constant  − 15.860
Direct
D1 maxh 0.213 83.5 (M = 80.0, F = 86.9) 83.5 (M = 80.0, F = 86.9)
maxb 0.146
Constant  − 14.476
D2 maxh 0.188 83.5 (M = 80.8, F = 86.2) 82.7 (M = 79.2, F = 86.2)
maxb 0.110
maxt 0.149
Constant  − 14.849
D3 lafb -0.021 83.1 (M = 81.5, F = 84.6) 83.1 (M = 81.5, F = 84.6)
maxb 0.087
maxh 0.158
maxt 0.136
haf 0.097
mafb 0.066
Constant  − 15.969
D4 maxh 0.248 83.1 (M = 83.1, F = 83.1) 82.3 (M = 81.5, F = 83.1)
maxt 0.233
Constant  − 14.322
D5 maxh 0.294 81.9 (M = 79.2, F = 84.6) 81.9 (M = 79.2, F = 84.6)
lafb 0.076
Constant  − 13.568

O, original classification rate before cross-validation; C, classification rate after cross-validation
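As a worked illustration, function D1 from Table 4 can be applied to the pooled male and female mean measurements from Table 2. Two points are assumptions rather than statements from the source: the sectioning point of 0 is given explicitly only for the univariate functions in Table 3, and scores above it are taken to indicate male because the coefficients are positive and males have the larger mean measurements.

```python
def discriminant_score(measurements, coefficients, constant):
    """Sum of coefficient x measurement over the variables, plus the constant."""
    return sum(coefficients[name] * measurements[name] for name in coefficients) + constant

def predict_sex(score, sectioning_point=0.0):
    # Assumption: scores above the sectioning point indicate male.
    return "male" if score > sectioning_point else "female"

# Function D1 from Table 4 (unstandardized coefficients and constant).
D1_COEFFICIENTS = {"maxh": 0.213, "maxb": 0.146}
D1_CONSTANT = -14.476

# Pooled group means from Table 2, in mm.
male_means = {"maxh": 42.07, "maxb": 43.77}
female_means = {"maxh": 37.25, "maxb": 38.85}

male_score = discriminant_score(male_means, D1_COEFFICIENTS, D1_CONSTANT)
female_score = discriminant_score(female_means, D1_COEFFICIENTS, D1_CONSTANT)
print(f"male means:   y = {male_score:+.3f} -> {predict_sex(male_score)}")
print(f"female means: y = {female_score:+.3f} -> {predict_sex(female_score)}")
```

The two group means produce scores of about +0.875 and −0.870, which lie almost symmetrically around zero, as expected for equal group sizes.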

Machine learning analysis

Best feature combination for sex prediction

In this study, three feature ranking algorithms were used to identify the top-ranked features among all features. Eight different classifiers were then trained with the Top-1 to Top-6 features to identify the best performing classification model and the best feature combination for sex prediction simultaneously. The RF and ET feature selection techniques produced the same feature ranking, while the XGBoost feature selection algorithm produced a different ranking, as shown in Fig. 4.

Fig. 4.

Top-ranked features using different feature selection techniques; A XGBoost, B random forest, and C Extra Tree algorithms

In this study, the Top-3 features (maxh, maxb, and maxt) from the random forest (RF) feature selection algorithm, combined with the RF machine learning (ML) classifier, outperformed the other combinations. Table 5 shows the overall accuracies and the weighted average performance for the other metrics (precision, sensitivity, specificity, and F1-score) with 95% confidence intervals for the Top-1 to Top-6 features under fivefold cross-validation using the best classifier (AdaBoost classifier for XGBoost feature selection and RF classifier for the RF and ET feature selection algorithms).

Table 5.

Comparison of the average performance metrics from five-fold cross-validation for top-ranked features using the best performing ML classifier

Feature selection Best ML classifier Features Weighted figures with 95% CI
Overall accuracy Precision Sensitivity Specificity F1-score
XGBoost AdaBoost Top-1 75.01 ± 5.26 74.88 ± 5.27 75.1 ± 5.26 75 ± 5.26 74.8 ± 5.28
Top-2 86.92 ± 4.1 86.93 ± 4.1 86.92 ± 4.1 86.92 ± 4.1 86.92 ± 4.1
Top-3 88.08 ± 3.94 88.19 ± 3.92 88.08 ± 3.94 88.08 ± 3.94 88.07 ± 3.94
Top-4 88.85 ± 3.83 88.87 ± 3.82 88.85 ± 3.83 88.85 ± 3.83 88.84 ± 3.83
Top-5 87.69 ± 3.99 87.73 ± 3.99 87.69 ± 3.99 87.69 ± 3.99 87.69 ± 3.99
Top-6 89.23 ± 3.77 89.31 ± 3.76 89.23 ± 3.77 89.23 ± 3.77 89.23 ± 3.77
RF and ET RF Top-1 80.77 ± 4.79 80.8 ± 4.79 80.77 ± 4.79 80.77 ± 4.79 80.76 ± 4.79
Top-2 84.23 ± 4.43 84.28 ± 4.42 84.23 ± 4.43 84.23 ± 4.43 80.76 ± 4.79
Top-3 89.23 ± 3.77 88.64 ± 3.86 90 ± 3.65 88.46 ± 3.88 89.31 ± 3.76
Top-4 87.31 ± 4.05 87.33 ± 4.04 87.31 ± 4.05 87.31 ± 4.05 87.31 ± 4.05
Top-5 88.08 ± 3.94 88.1 ± 3.94 88.08 ± 3.94 88.08 ± 3.94 88.08 ± 3.94
Top-6 88.08 ± 3.94 88.13 ± 3.93 88.08 ± 3.94 88.08 ± 3.94 88.07 ± 3.94

The Top-3 features (maxh, maxb, and maxt) from the RF and ET feature selection techniques produced the best performance with the RF classifier, with an overall accuracy and weighted precision, sensitivity, specificity, and F1-score of 89.23%, 88.64%, 90.00%, 88.46%, and 89.31%, respectively (Table 5). With the XGBoost feature selection technique, all six features were required to reach the best performance with the AdaBoost classifier (overall accuracy and weighted precision, sensitivity, specificity, and F1-score of 89.23%, 89.31%, 89.23%, 89.23%, and 89.23%, respectively), whereas similar performance was obtained with only three features from the RF and ET feature selection techniques with the RF classifier.

Development and validation of different ML and stacking models

Using the best combination of three features (maxh, maxb, and maxt), we selected the best ML classifiers among the eight classifiers as base models and trained different ML classifiers as meta learners. The top two models were RF and ET, for which the overall accuracy and weighted precision, sensitivity, specificity, and F1-score were 89.23%, 88.64%, 90.00%, 88.46%, and 89.31% (RF) and 85.34%, 85.27%, 85.03%, 85.45%, and 85.14% (ET), respectively (Table 6). The stacking model with RF and ET classifiers as base learners and the Gradient Boosting classifier as meta learner outperformed the other meta learner classifiers, with an overall accuracy and weighted precision, sensitivity, specificity, and F1-score of 90.77%, 89.55%, 92.3%, 89.23%, and 90.9%, respectively.

Table 6.

Comparison of the average performance metrics from five-fold cross-validation for different classifiers and stacking classifiers

Classifier Weighted with 95% CI
Overall accuracy Precision Sensitivity Specificity F1-score
Linear discriminant analysis 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47
XGB classifier 81.15 ± 4.75 81.2 ± 4.75 81.15 ± 4.75 81.15 ± 4.75 81.15 ± 4.75
Random forest classifier 89.23 ± 3.77 88.64 ± 3.86 90 ± 3.65 88.46 ± 3.88 89.31 ± 3.76
Logistic regression 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47 83.85 ± 4.47
Extra trees classifier 85.34 ± 4.3 85.27 ± 4.31 85.03 ± 4.34 85.45 ± 4.29 85.14 ± 4.32
AdaBoost classifier 83.46 ± 4.52 83.51 ± 4.51 83.46 ± 4.52 83.46 ± 4.52 83.46 ± 4.52
K neighbors classifier 83.08 ± 4.56 83.27 ± 4.54 83.08 ± 4.56 83.08 ± 4.56 83.05 ± 4.56
Gradient boosting classifier 85 ± 4.34 85.1 ± 4.33 85 ± 4.34 85 ± 4.34 85.03 ± 4.34
Stacking model 90.77 ± 3.52 89.55 ± 3.72 92.3 ± 3.24 89.23 ± 3.77 90.9 ± 3.5
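The source does not state how the 95% confidence intervals were computed. The half-widths in Table 6 are, however, numerically consistent with a normal-approximation (Wald) interval for a proportion with n = 260 and z = 1.96. The sketch below is a check under that assumption, not the authors' documented procedure.

```python
from math import sqrt

def wald_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * sqrt(p * (1 - p) / n)

N_SUBJECTS = 260  # pooled sample: 130 males and 130 females

# Overall accuracies from Table 6, expressed as proportions.
stack_hw = wald_half_width(0.9077, N_SUBJECTS) * 100
rf_hw = wald_half_width(0.8923, N_SUBJECTS) * 100
print(f"Stacking model: 90.77 ± {stack_hw:.2f}")
print(f"Random forest:  89.23 ± {rf_hw:.2f}")
```

Under this assumption the half-widths come out at about 3.52 and 3.77 percentage points, matching the tabulated values for the stacking model and the random forest classifier.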

Figure 5A shows the confusion matrix of the best performing ML classifier (RF classifier), and Fig. 5B shows the confusion matrix of the best performing stacking model (with the Gradient Boosting classifier as a meta learner). Even with the best performing RF classifier, 13 out of 130 male subjects were misclassified as female and 15 out of 130 female subjects were misclassified as male. With the stacking model, 120 out of 130 male subjects were correctly classified as male and 116 out of 130 female subjects were correctly classified as female. Thus, the stacking model outperformed other state-of-the-art ML classifiers.

Fig. 5.

Confusion matrix for sex prediction model using A random forest classifier and B stacking classifier

Figure 6 shows the receiver operating characteristic (ROC) curves, with the area under the curve (AUC, also known as AUROC), for sex identification using different ML classifiers. The AUC is one of the most important evaluation metrics for checking a classification model’s performance. The stacking model outperformed the other ML classifiers with an AUC of 92.65% (Fig. 6).
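For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen positive case (here, male) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison on hypothetical scores; the function name `auc_from_scores` is introduced here for illustration.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive case outranks a negative one."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical predicted probabilities of "male" (not the study data).
male_scores = [0.9, 0.8, 0.4]
female_scores = [0.7, 0.3, 0.2]
auc = auc_from_scores(male_scores, female_scores)
print(f"AUC = {auc:.3f}")
```

In this toy example eight of the nine male-female pairs are ordered correctly, giving an AUC of 8/9.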

Fig. 6.

ROC curve for sex prediction classifier using different ML models and stacking classifier

Discussion

Sex prediction from measurements of bones that display sexual dimorphism is an important aspect of forensic anthropology. Population-specific standards, which are generally considered to provide the best estimation of sex, have been published for the skull and postcranial elements in different parts of the world [30, 40, 44, 50, 75, 76]. These became necessary because of the observed variation in the display of sexual dimorphism between different population groups [26]. Consequently, applying standards developed for one population group to other groups is not encouraged. One disadvantage of using population-specific standards is the need for prior knowledge of the population group of the skeleton under forensic analysis [63, 64].

In the present study, patella measurements of South Africans were shown to be sexually dimorphic, which is consistent with the results of previous studies on patella of Italians [77], Americans [78], Iranians [79], Spaniards [80], African Americans [81], Japanese [82], Turks [83], and Swiss [84]. The range of average accuracies obtained for pooled multivariate discriminant function equations (DFEs) and the stacking ML technique in the current study (81.9–90.8%, Tables 4 and 6) compares well with those reported in previous studies from other parts of the world (Table 7). It is interesting to note that the highest average accuracies for all studies that used skeletal collections as the data source do not exceed approximately 85% [46, 66, 80, 81, 84]. Other studies in which data were acquired from radiological modalities or at autopsy presented higher average accuracies in correct sex classification. This indicates that the source of data and the way the data are collected may influence the outcome of the analysis.

Table 7. Comparison of average accuracies in correct sex classification from previous studies and the present study

Measurement cells give mean (SD) in mm. Empty cells were not reported.

| Study | Year | Population group | Data source | MAXH, males | MAXH, females | MAXW, males | MAXW, females | MAXT, males | MAXT, females | HAF, males | HAF, females | MAFB, males | MAFB, females | LAFB, males | LAFB, females | Highest average accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Introna et al. [77] | 1998 | Italian | Skeletal collection | 41.2 (2.9) | 37 (2.9) | 43.2 (2.7) | 39.4 (3.2) | 20.4 (1.9) | 18.3 (1.6) | | | | | | | 63.6 |
| Bidmos et al. [46] | 2005 | South African whites | Skeletal collection | 43.6 (3.1) | 38.7 (3.1) | 45.3 (3.3) | 40.3 (3.3) | 20.4 (1.8) | 18.4 (1.8) | 30.8 (2.6) | 27.6 (3.1) | 20.5 (1.8) | 18.2 (2.2) | 28.1 (2.7) | 25.0 (2.5) | 85.0 |
| Dayal and Bidmos [66] | 2005 | South African blacks | Skeletal collection | 41.2 (3.1) | 36.5 (2.2) | 43.3 (2.5) | 39.0 (2.9) | 20.6 (1.4) | 18.2 (1.7) | 29.6 (3.0) | 27.9 (2.7) | 18.4 (1.9) | 16.3 (1.6) | 25.3 (2.1) | 22.9 (2.1) | 85 |
| Mahfouz et al. [78] | 2007 | Americans | Skeletal collection, live and cadaveric | Non-linear method | | | | | | | | | | | | 93.5 |
| Akhlaghi et al. [79] | 2010 | Iranians | Autopsy: caliper | 44.7 (2.7) | 38.4 (2.0) | 45.5 (2.2) | 40.1 (1.9) | 21.9 (1.9) | 20.3 (1.4) | | | | | | | 92.9 |
| Peckmann et al. [80] | 2016 | Spanish | Skeletal collection | 42.9 (3.0) | 37.9 (3.0) | 44.6 (3.3) | 4.3 (2.9) | 20.3 (1.9) | 18.1 (1.8) | 25.7 (2.3) | 23.6 (2.9) | 19.2 (2.3) | 17.0 (1.8) | 24.8 (2.3) | 22.5 (2.1) | 84.8 |
| Peckmann and Fisher [81] | 2016 | African Americans | Skeletal collection | 44.8 (3.5) | 39.8 (3.3) | 45.0 (3.8) | 39.8 (3.3) | 20.8 (2.0) | 19.2 (2.0) | 32.9 (2.3) | 29.6 (2.6) | 20.8 (2.2) | 18.0 (2.1) | 24.4 (2.2) | 21.5 (2.3) | 85.0 |
| Michiue et al. [82] | 2018 | Japanese | Autopsy: CT | 44.1 (3.5) | 38.8 (2.7) | | | 22.5 (1.5) | 38.8 (2.7) | | | | | | | 87.7 |
| Teke et al. [83] | 2018 | Turkish | MRI scans of patients | 41.3 (3.4) | 35.8 (2.3) | 46.3 (3.0) | 40.4 (3.0) | 22.4 (1.6) | 19.9 (1.7) | | | | | | | 89.0 |
| Indra et al. [84] | 2021 | Swiss | Skeletal collection | 44.2 (3.1) | 38.7 (2.5) | 45.3 (3.3) | 40.3 (2.9) | 22.7 (2.1) | 20.3 (1.7) | | | | | | | 83.8 |
| Current study | 2022 | South Africans | Skeletal collection | 42.1 (3.1) | 37.3 (2.8) | 43.8 (3.3) | 38.8 (3.2) | 20.4 (1.6) | 18.1 (1.8) | 30.3 (2.7) | 27.4 (2.8) | 19.7 (2.0) | 17.5 (2.2) | 26.3 (2.7) | 23.4 (2.6) | 84.2 |

In addition, the average accuracies for the pooled patella data from the current study using discriminant function analysis (81.9–84.2%) are similar to those observed for SAED (75–85% [46]) and SAAD (78–85% [66]). The largest drop in average accuracy on cross-validation in the current study (0.8%) is lower than those from population-specific DFEs for SAED and SAAD, which were 2.5% and 3.3%, respectively. This lower drop for DFEs obtained from pooled data compared to population-specific DFEs agrees with Bidmos and Mazengenya [85], in which the largest drop in average accuracy for pooled-data DFEs was 0.9%. It indicates a better validity of pooled DFEs compared to population-specific DFEs. Another previously documented advantage of DFEs from pooled data is that they can be applied to an unknown skeleton without prior knowledge of the population group [64].
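The drop in average accuracy referred to above is the difference between the accuracy of a rule on the sample it was fitted to and its leave-one-out accuracy. A minimal Python sketch of that calculation follows. It assumes a univariate rule whose sectioning point is the midpoint of the male and female means, and it uses synthetic MAXH values (50 per sex, a hypothetical split) drawn from normal distributions with the means and SDs reported for the current study in Table 7. The function names are illustrative, and this is not the authors' analysis.

```python
import random
from statistics import mean

def sectioning_point(males, females):
    """Midpoint between the male and female group means."""
    return (mean(males) + mean(females)) / 2.0

def resubstitution_accuracy(males, females):
    """Accuracy when the rule is fitted and tested on the same sample."""
    cut = sectioning_point(males, females)
    correct = sum(x > cut for x in males) + sum(x <= cut for x in females)
    return correct / (len(males) + len(females))

def leave_one_out_accuracy(males, females):
    """Accuracy when each bone is classified by a rule fitted without it."""
    correct = 0
    for i, x in enumerate(males):
        cut = sectioning_point(males[:i] + males[i + 1:], females)
        correct += x > cut
    for i, x in enumerate(females):
        cut = sectioning_point(males, females[:i] + females[i + 1:])
        correct += x <= cut
    return correct / (len(males) + len(females))

# Synthetic data only (assumption): MAXH in mm, Table 7 means and SDs.
rng = random.Random(0)
males = [rng.gauss(42.1, 3.1) for _ in range(50)]
females = [rng.gauss(37.3, 2.8) for _ in range(50)]

original = resubstitution_accuracy(males, females)
cross_validated = leave_one_out_accuracy(males, females)
drop = original - cross_validated
print(f"original={original:.3f} cross-validated={cross_validated:.3f} drop={drop:.3f}")
```

A small drop means the rule classifies bones it has not seen about as well as the bones it was fitted to, which is the sense in which the pooled DFEs are described as more valid.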

The same performance trend is observed in the current study when the ML algorithms are compared with the conventional statistical model. The standards generated for sex classification using ML produced higher average accuracies (Table 4) than those generated using discriminant function analysis (Table 3). Compared with the average accuracies for the pooled patella data using discriminant function analysis (81.9–84.2%), the stacking machine learning approach provides an overall accuracy of 90.77%. This indicates that machine learning can improve the classification of sex from patella measurements.
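The stacking idea can be illustrated with a short, self-contained Python sketch: base learners are fitted, their out-of-fold scores become the inputs of a meta learner, and the meta learner makes the final decision. Everything specific here is an assumption made for illustration. The base learners are simple midpoint rules on MAXH, MAXW, and MAXT rather than the three top-performing classifiers used in the study, the meta learner is a logistic regression fitted by gradient descent, and the data are synthetic draws from the Table 7 means and SDs with the features treated as independent. The resulting accuracy therefore says nothing about the real sample.

```python
import math
import random
from statistics import mean

# Table 7 means and SDs (mm) for MAXH, MAXW, MAXT; 1 = male, 0 = female.
MEANS = {1: (42.1, 43.8, 20.4), 0: (37.3, 38.8, 18.1)}
SDS = {1: (3.1, 3.3, 1.6), 0: (2.8, 3.2, 1.8)}

def synthetic_sample(rng, n_per_sex):
    """Independent normal draws per feature (assumption: no correlations)."""
    rows, labels = [], []
    for _ in range(n_per_sex):
        for sex in (1, 0):
            rows.append([rng.gauss(m, s) for m, s in zip(MEANS[sex], SDS[sex])])
            labels.append(sex)
    return rows, labels

def fit_base(rows, labels, j):
    """Toy base learner: midpoint sectioning point for feature j."""
    male = [r[j] for r, y in zip(rows, labels) if y == 1]
    female = [r[j] for r, y in zip(rows, labels) if y == 0]
    return (mean(male) + mean(female)) / 2.0

def base_scores(row, cuts):
    """Signed distance of each feature from its sectioning point."""
    return [x - c for x, c in zip(row, cuts)]

def out_of_fold_scores(rows, labels, k=5):
    """Meta-features: each row is scored by base learners fitted without it."""
    n = len(rows)
    meta = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        tr_rows = [rows[i] for i in train]
        tr_labels = [labels[i] for i in train]
        cuts = [fit_base(tr_rows, tr_labels, j) for j in range(len(rows[0]))]
        for i in range(fold, n, k):
            meta[i] = base_scores(rows[i], cuts)
    return meta

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta(meta, labels, epochs=300, lr=0.05):
    """Meta learner: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(meta[0]), 0.0, len(meta)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for s, y in zip(meta, labels):
            err = sigmoid(b + sum(wi * si for wi, si in zip(w, s))) - y
            grad_w = [g + err * si for g, si in zip(grad_w, s)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(row, cuts, w, b):
    s = base_scores(row, cuts)
    return 1 if sigmoid(b + sum(wi * si for wi, si in zip(w, s))) >= 0.5 else 0

rng = random.Random(0)
train_rows, train_labels = synthetic_sample(rng, 50)
test_rows, test_labels = synthetic_sample(rng, 50)

meta = out_of_fold_scores(train_rows, train_labels)
w, b = fit_meta(meta, train_labels)
# Base learners are refitted on all training rows before final prediction.
cuts = [fit_base(train_rows, train_labels, j) for j in range(3)]
predictions = [predict(r, cuts, w, b) for r in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"stacked accuracy on synthetic test data: {accuracy:.3f}")
```

Training the meta learner on out-of-fold scores, rather than on scores from base learners that have already seen each row, is what keeps the second stage from simply learning the base learners' overfitting.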

From the aforementioned, linear and volumetric measurements of the patella are useful in human identification and have produced acceptably high average accuracies in correct sex classification. However, human identification from skeletal remains can be demanding, especially in a country like South Africa with diverse population groups. Consequently, the application of population-specific DFEs in the human identification process requires the prior assignment of the population group, which might be difficult if not impossible where complete skeletons are not available or where bones that display obvious population-specific traits are absent. Another confounding problem is the difficulty of assigning a population group to individuals who fall within the boundaries of other population groups [52]. This has led some researchers [63, 64] to propose the generation of generic standards for the estimation of sex and stature, especially for population groups that share similarities. In both studies, the authors argue for the generation and use of generic equations for sex assignment [63] and stature estimation [64], citing the lack of adequate data and bone collections from which population-specific standards could be derived in some countries.

The pelvic bone is considered one of the most sexually dimorphic bones in the body because of its adaptation for parturition in females. Measurements of this bone have been used in the generation of population-specific DFEs in different parts of the world [2]. Steyn and Patriquin [63] assessed the reliability of population-specific DFEs compared to those from pooled data from diverse population groups. They reported a comparable performance of population-specific and generic DFEs with regard to classification rates and concluded that population-specific equations are not superior to generic equations for sex prediction using dimensions of the pelvic bone. Macaluso Jr [86] evaluated the reliability of the generic equations published by Steyn and Patriquin [63] on a French sample and reported that the average accuracies of the pooled data remained unchanged. In addition, there was no significant difference between the average accuracies obtained from population-specific equations and generic equations [86]. This observation provided further evidence that generic equations for sex prediction using pelvic measurements are useful and applicable to other related population groups, a setting in which the application of ML could help significantly.

Attempts have also been made to apply the notion of non-superiority of population-specific equations over generic equations using measurements of the vertebrae. Hora and Sládek [87] observed that anteroposterior and mediolateral body diameters were universally applicable in sex prediction, while other measurements of the studied vertebrae showed population specificity in the assignment of sex. In a similar study, Bidmos and Mazengenya [85] investigated the utility of pooled data in the generation of generic equations for sex prediction. They evaluated the accuracies of population-specific equations formulated from measurements of long upper limb bones of South Africans and noted that the average accuracies of generic equations are acceptably high (81 to 87%). In addition, the cross-validated accuracies remained largely unchanged, thereby confirming the usefulness of these equations in cases where it is difficult to establish the population affinity of the skeletal remains under forensic investigation.

Recently, Indra et al. [84] assessed the validity of population-specific DFEs formulated for patella measurements of the contemporary Spanish population group on a Swiss sample. The average accuracies obtained by Indra et al. [84] ranged from 63 to 84% for the patella, which was similar to those presented in an earlier study by Peckmann et al. [80]. In the current study, the average accuracies obtained for generic equations are comparable to those presented for population-specific equations for South Africans of European [46] and African descent [66], in agreement with the observations made in previous studies [63, 84–87]. This, therefore, shows the utility of generic equations when the patella is available for forensic analysis in South Africa.

The range of average accuracies for generic equations formulated in the current study (81.9–90.8%) is similar to those obtained for population-specific equations derived for South Africans of European (67.5–85%) and African descent (78.3–85%). This is in agreement with the observation that was made by Indra et al. [84].

Conclusions

Prediction of sex from recovered or discovered bones is a very important step in human identification in forensic anthropology, along with the estimation of age, stature, and population affinity. In this study, we used a dataset of 100 individuals from a sample of patellae of Mixed Ancestry South Africans (MASA). Six parameters (maxh, maxw, maxt, haf, lafb, and mafb) were used. Two types of investigation were carried out to compare the performance of conventional statistical analysis with that of classical machine learning techniques in the estimation of sex. Different discriminant function analyses were performed for measurements that exhibited significant differences between male and female mean measurements. In addition, several ML algorithms were trained, validated, and tested to identify the best feature combination for predicting sex from the patella measurements. The range of average accuracies obtained for pooled multivariate DFEs is 81.9–84.2%, while the stacking ML technique provides 90.8% accuracy, which compares well with those presented in previous studies. In conclusion, findings from the current study show that generic models formulated from measurements of the patella of different population groups in South Africa are useful, with reasonably high average accuracies. Consequently, they are useful in the prediction of sex in cases where the population affinity is either difficult or impossible to ascertain. Their applicability to populations of Southern Africa will require validation studies in individual populations from different countries in the region.

Acknowledgements

The authors sincerely thank those who donated their bodies to science so that anatomical research could be performed. Results from such research can potentially increase mankind's overall knowledge which can then improve forensic science. Therefore, these donors and their families deserve our highest gratitude. We are grateful to the School of Anatomical Sciences of the University of the Witwatersrand for giving access to the Raymond Dart Collections.

Funding

Open Access funding provided by the Qatar National Library.

Data availability

Data available on request.

Declarations

Ethics approval

Ethical clearance waiver was obtained with number W-CJ-140604–1.

Informed consent

Not applicable.

Conflict of interest

The authors declare no competing interests.

Research involving human participants and/or animals

Not applicable.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Loth SR, İşcan MY (2000) Morphological age estimation. In: Siegel JA, Saukko PJ, Knupfer GC (eds) Encyclopaedia of forensic sciences. Academic Press, London, p 1600
  • 2. İşcan and Steyn (2013) The human skeleton in forensic medicine, 3rd ed. Charles C Thomas Publisher, p 493. doi: 10.1002/ajpa.22754
  • 3. Đurić M, Rakočević Z, Đonić D. The reliability of sex determination of skeletons from forensic context in the Balkans. Forensic Sci Int. 2004;147:159–164. doi: 10.1016/j.forsciint.2004.09.111.
  • 4. Rogers T, Saunders S. Accuracy of sex determination using morphological traits of the human pelvis. J Forensic Sci. 1994;39:13683J. doi: 10.1520/JFS13683J.
  • 5. Kimmerle EH, Ross A, Slice D. Sexual dimorphism in America: geometric morphometric analysis of the craniofacial region. J Forensic Sci. 2008;53:54–57. doi: 10.1111/j.1556-4029.2007.00627.x.
  • 6. Bigoni L, Velemínská J, Brůžek J. Three-dimensional geometric morphometric analysis of cranio-facial sexual dimorphism in a Central European sample of known sex. Homo. 2010;61:16–32. doi: 10.1016/J.JCHB.2009.09.004.
  • 7. Franklin D, Cardini A, Flavel A, Kuliukas A. The application of traditional and geometric morphometric analyses for forensic quantification of sexual dimorphism: preliminary investigations in a Western Australian population. Int J Legal Med. 2012;126:549–558. doi: 10.1007/s00414-012-0684-8.
  • 8. Rusk KM, Ousley SD. An evaluation of sex-and ancestry-specific variation in sacral size and shape using geometric morphometrics. Am J Phys Anthropol. 2016;159:646–654. doi: 10.1002/ajpa.22926.
  • 9. Čechová M, Dupej J, Brůžek J, et al. Sex estimation using external morphology of the frontal bone and frontal sinuses in a contemporary Czech population. Int J Legal Med. 2019;133:1285–1294. doi: 10.1007/s00414-019-02063-8.
  • 10. Bertsatos A, Chovalopoulou ME, Brůžek J, Bejdová Š. Advanced procedures for skull sex estimation using sexually dimorphic morphometric features. Int J Legal Med. 2020;134:1927–1937. doi: 10.1007/s00414-020-02334-9.
  • 11. del Bove A, Profico A, Riga A, et al. A geometric morphometric approach to the study of sexual dimorphism in the modern human frontal bone. Am J Phys Anthropol. 2020;173:643–654. doi: 10.1002/ajpa.24154.
  • 12. Kajanoja P. Sex determination of finnish crania by discriminant function analysis. Am J Phys Anthropol. 1966;24:29–33. doi: 10.1002/ajpa.1330240104.
  • 13. İşcan MY, Yoshino M, Kato S. Sexual dimorphism in modern Japanese crania. Am J Hum Biol. 1995;7:459–464. doi: 10.1002/AJHB.1310070407.
  • 14. Patil KR, Mody RN. Determination of sex by discriminant function analysis and stature by regression analysis: a lateral cephalometric study. Forensic Sci Int. 2004;147:175–180. doi: 10.1016/j.forsciint.2004.09.071.
  • 15. Spradley MK, Jantz RL. Sex estimation in forensic anthropology: skull versus postcranial elements. J Forensic Sci. 2011;56:289–296. doi: 10.1111/j.1556-4029.2010.01635.x.
  • 16. Ogawa Y, Imaizumi K, Miyasaka S, Yoshino M. Discriminant functions for sex estimation of modern Japanese skulls. J Forensic Legal Med. 2013;20:234–238. doi: 10.1016/j.jflm.2012.09.023.
  • 17. Marinescu M, Panaitescu V, Rosu M, Maru N, Punga A. Sexual dimorphism of crania in a Romanian population: discriminant function analysis approach for sex estimation. Rom J Leg Med. 2014;22:21–26. doi: 10.4323/rjlm.2014.21.
  • 18. Marino EA. Sex estimation using the first cervical vertebra. Am J Phys Anthropol. 1995;97:127–133. doi: 10.1002/AJPA.1330970205.
  • 19. Garoufi N, Bertsatos A, Chovalopoulou ME, Villa C. Forensic sex estimation using the vertebrae: an evaluation on two European populations. Int J Legal Med. 2020;134:2307–2318. doi: 10.1007/S00414-020-02430-W.
  • 20. Oikonomopoulou EK, Valakos E, Nikita E. Population-specificity of sexual dimorphism in cranial and pelvic traits: evaluation of existing and proposal of new functions for sex assessment in a Greek assemblage. Int J Legal Med. 2017;131:1731–1738. doi: 10.1007/s00414-017-1655-x.
  • 21. Knecht S, Nogueira L, Maël S, et al. Sex estimation from the greater sciatic notch: a comparison of classical statistical models and machine learning algorithms. Int J Legal Med. 2021;135:2603–2613. doi: 10.1007/s00414-021-02700-1.
  • 22. Cao Y, Ma Y, Vieira DN, et al. A potential method for sex estimation of human skeletons using deep learning and three-dimensional surface scanning. Int J Legal Med. 2021;135:2409–2421. doi: 10.1007/s00414-021-02675-z.
  • 23. Holman DJ, Bennett KA. Determination of sex from arm bone measurements. Am J Phys Anthropol. 1991;84:421–426. doi: 10.1002/ajpa.1330840406.
  • 24. Işcan MY, Loth SR, King CA, et al. Sexual dimorphism in the humerus: a comparative analysis of Chinese, Japanese and Thais. Forensic Sci Int. 1998;98:17–29. doi: 10.1016/S0379-0738(98)00119-4.
  • 25. Mall G, Hubig M, Büttner A, et al. Sex determination and estimation of stature from the longbones of the arm. Forensic Sci Int. 2001;117:23–30. doi: 10.1016/S0379-0738(00)00445-X.
  • 26. Sakaue K. Sexual determination of long bones in recent Japanese. Anthropol Sci. 2004;112:75–81. doi: 10.1537/ase.00067.
  • 27. Frutos LR. Metric determination of sex from the humerus in a Guatemalan forensic sample. Forensic Sci Int. 2004;147:153–157. doi: 10.1016/j.forsciint.2004.09.077.
  • 28. Kranioti EF, Michalodimitrakis M. Sexual dimorphism of the humerus in contemporary cretans–a population-specific study and a review of the literature. J Forensic Sci. 2009;54:996–1000. doi: 10.1111/j.1556-4029.2009.01103.x.
  • 29. Celbis O, Agritmis H. Estimation of stature and determination of sex from radial and ulnar bone lengths in a Turkish corpse sample. Forensic Sci Int. 2006;159:135–139. doi: 10.1016/j.forsciint.2005.05.016.
  • 30. Black TKA. A new method for assessing the sex of fragmentary skeletal remains: femoral shaft circumference. Am J Phys Anthropol. 1978;48:227–231. doi: 10.1002/ajpa.1330480217.
  • 31. Işcan MY, Shihai D. Sexual dimorphism in the Chinese femur. Forensic Sci Int. 1995;74:79–87. doi: 10.1016/0379-0738(95)01691-B.
  • 32. Mall G, Graw M, Gehring KD, Hubig M. Determination of sex from femora. Forensic Sci Int. 2000;113:315–321. doi: 10.1016/S0379-0738(00)00240-1.
  • 33. King C, İşcan MY, Loth SR. Metric and comparative analysis of sexual dimorphism in the Thai femur. J Forensic Sci. 1998;43:954–958. doi: 10.1520/JFS14340J.
  • 34. Jantz RL, Kimmerle EH, Baraybar JP. Sexing and stature estimation criteria for Balkan populations. J Forensic Sci. 2008;53:601–605. doi: 10.1111/j.1556-4029.2008.00716.x.
  • 35. Boldsen JL, Milner GR, Boldsen SK. Sex estimation from modern american humeri and femora, accounting for sample variance structure. Am J Phys Anthropol. 2015;158:745–750. doi: 10.1002/ajpa.22812.
  • 36. Moore MK, DiGangi EA, Niño Ruíz FP, et al. Metric sex estimation from the postcranial skeleton for the Colombian population. Forensic Sci Int. 2016;262:286.e1–286.e8. doi: 10.1016/J.FORSCIINT.2016.02.018.
  • 37. Steyn M, Işcan MY. Sexual dimorphism in the crania and mandibles of South African whites. Forensic Sci Int. 1998;98:9–16. doi: 10.1016/S0379-0738(98)00120-0.
  • 38. Dayal MR, Spocter MA, Bidmos MA. An assessment of sex using the skull of black South Africans by discriminant function analysis. HOMO- J Comp Hum Biol. 2008;59:209–221. doi: 10.1016/j.jchb.2007.01.001.
  • 39. Vance VL, Steyn M, L’Abbé EN. Nonmetric sex determination from the distal and posterior humerus in black and white South Africans. J Forensic Sci. 2011;56:710–714. doi: 10.1111/j.1556-4029.2011.01724.x.
  • 40. Steyn M, Işcan MY. Sex determination from the femur and tibia in South African whites. Forensic Sci Int. 1997;90:111–119. doi: 10.1016/S0379-0738(97)00156-4.
  • 41. Asala SA, Bidmos MA, Dayal MR. Discriminant function sexing of fragmentary femur of South African blacks. Forensic Sci Int. 2004;145:25–29. doi: 10.1016/j.forsciint.2004.03.010.
  • 42. Barrier ILO, L’Abbé EN. Sex determination from the radius and ulna in a modern South African sample. Forensic Sci Int. 2008;179:85.e1–85.e7. doi: 10.1016/j.forsciint.2008.04.012.
  • 43. Krüger GC, L’Abbé EN, Stull KE. Sex estimation from the long bones of modern South Africans. Int J Legal Med. 2017;131:275–285. doi: 10.1007/s00414-016-1488-z.
  • 44. Bidmos MA, Asala SA. Discriminant function sexing of the calcaneus of the South African whites. J Forensic Sci. 2003;48:1213–1218. doi: 10.1520/JFS2003104.
  • 45. Bidmos MA, Asala SA. Sexual dimorphism of the calcaneus of South African blacks. J Forensic Sci. 2004;49(3):446–450. doi: 10.1520/JFS2003254.
  • 46. Bidmos MA, Steinberg N, Kuykendall KL. Patella measurements of South African whites as sex assessors. HOMO- J Comp Hum Biol. 2005;56:69–74. doi: 10.1016/J.JCHB.2004.10.002.
  • 47. Dayal MR, Kegley AD, Štrkalj G, et al. The history and composition of the Raymond A. Dart collection of human skeletons at the University of the Witwatersrand, Johannesburg, South Africa. Am J Phys Anthropol. 2009;140:324–335. doi: 10.1002/ajpa.21072.
  • 48. L’Abbé EN, Loots M, Meiring JH. The pretoria bone collection: a modern South African skeletal sample. Homo. 2005;56:197–205. doi: 10.1016/J.JCHB.2004.10.004.
  • 49. Gibbon VE, Morris AG. UCT Human skeletal repository: its stewardship, history, composition and educational use. Homo. 2021;72:139–147. doi: 10.1127/HOMO/2021/1402.
  • 50. Krüger GC, L'Abbé EN, Stull KE, Kenyhercz MW. Sexual dimorphism in cranial morphology among modern South Africans. Int J Legal Med. 2015;129(4):869–75. doi: 10.1007/s00414-014-1111-0.
  • 51. Liebenberg L, Krüger GC, L’Abbé EN, Stull KE. Postcraniometric sex and ancestry estimation in South Africa: a validation study. Int J Legal Med. 2019;133:289–296. doi: 10.1007/s00414-018-1865-x.
  • 52. Mokoena P, Billings BK, Gibbon V, et al. Development of discriminant functions to estimate sex in upper limb bones for mixed ancestry South Africans. Sci Justice. 2019;59:660–666. doi: 10.1016/j.scijus.2019.06.007.
  • 53. Austin PC, Lee DS, Steyerberg EW, Tu JV. Regression trees for predicting mortality in patients with cardiovascular disease: what improvement is achieved by using ensemble-based methods? Biom J. 2012;54:657–673. doi: 10.1002/bimj.201100251.
  • 54. Shouval R, Hadanny A, Shlomo N, et al. Machine learning for prediction of 30-day mortality after ST elevation myocardial infraction: an acute coronary syndrome Israeli Survey data mining study. Int J Cardiol. 2017;246:7–13. doi: 10.1016/j.ijcard.2017.05.067.
  • 55. Pieszko K, Hiczkiewicz J, Budzianowski P, Budzianowski J, Rzeźniczak J, Pieszko K, Burchardt P (2019) Predicting long-term mortality after acute coronary syndrome using machine learning techniques and hematological markers. Dis Markers: 1–9. doi: 10.1155/2019/9056402
  • 56. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44–56. doi: 10.1038/s41591-018-0300-7.
  • 57. Deo RC. Machine learning in medicine. Circulation. 2015;132:1920–1930. doi: 10.1161/CIRCULATIONAHA.115.001593.
  • 58. Harrell Jr, Frank E (2019) Glossary of statistical terms. Vanderbilt University School of Medicine. https://hbiostat.org/doc/glossary.pdf. Accessed 1 May 2022
  • 59. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–118. doi: 10.1038/nature21056.
  • 60. Shung D, Simonov M, Gentry M, et al. Machine learning to predict outcomes in patients with acute gastrointestinal bleeding: a systematic review. Dig Dis Sci. 2019;64:2078–2087. doi: 10.1007/s10620-019-05645-z.
  • 61. Mortazavi BJ, Bucholz EM, Desai NR, Huang C, Curtis JP, Masoudi FA, Shaw RE, Negahban SN, Krumholz HM. Comparison of machine learning methods with national cardiovascular data registry models for prediction of risk of bleeding after percutaneous coronary intervention. JAMA Netw Open. 2019;2(7):e196835. doi: 10.1001/jamanetworkopen.2019.6835.
  • 62. Angraal S, Mortazavi BJ, Gupta A, et al. Machine learning prediction of mortality and hospitalization in heart failure with preserved ejection fraction. JACC Heart Fail. 2020;8:12–21. doi: 10.1016/j.jchf.2019.06.013.
  • 63. Steyn M, Patriquin ML. Osteometric sex determination from the pelvis - does population specificity matter? Forensic Sci Int. 2009;191:113.e1–113.e5. doi: 10.1016/J.FORSCIINT.2009.07.009.
  • 64. Albanese J, Tuck A, Gomes J, Cardoso HF (2016) An alternative approach for estimating stature from long bones that is not population-or group-specific. Forensic Sci Int 259:59–68. doi: 10.1016/j.forsciint.2015.12.011
  • 65. Howley D, Howley P, Oxenham MF. Estimation of sex and stature using anthropometry of the upper extremity in an Australian population. Forensic Sci Int. 2018;287:220.e1–220.e10. doi: 10.1016/J.FORSCIINT.2018.03.017.
  • 66. Dayal MR, Bidmos MA. Discriminating sex in South African blacks using patella dimensions. J Forensic Sci. 2005;50:1–4. doi: 10.1520/JFS2004306.
  • 67. Lin LK. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989;45:255. doi: 10.2307/2532051.
  • 68. Singh D, Singh B. Investigating the impact of data normalization on classification performance. Appl Soft Comput. 2020;97:105524. doi: 10.1016/j.asoc.2019.105524.
  • 69. Chen T, Guestrin C (2016) XGBoost: reliable large-scale tree boosting system. arXiv
  • 70. Sharaff A, Gupta H. Extra-tree classifier with metaheuristics approach for email classification. Adv Intell Syst. 2019;924:189–197. doi: 10.1007/978-981-13-6861-5_17.
  • 71. Biau G, Scornet E. A random forest guided tour. TEST. 2016;25:197–227. doi: 10.1007/s11749-016-0481-7.
  • 72. Biau G, Scornet E. Rejoinder on: a random forest guided tour. TEST. 2016;25:264–268. doi: 10.1007/S11749-016-0488-0.
  • 73. Guo G, Wang H, Bell D et al (2003) KNN model-based approach in classification. In: Meersman R, Tari Z, Schmidt DC (eds) On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE. OTM 2003. Lect Notes in Comput Sci vol 2888. Springer, Berlin, Heidelberg, pp 986–996. doi: 10.1007/978-3-540-39964-3_62
  • 74. Subasi A. Practical machine learning for data analysis using python. Academic Press; 2020.
  • 75. DiBennardo R, Taylor JV. Sex assessment of the femur: a test of a new method. Am J Phys Anthropol. 1979;50:635–637. doi: 10.1002/AJPA.1330500415.
  • 76. Garcia S. Is the circumference at the nutrient foramen of the tibia of value to sex determination on human osteological collections? testing a new method. Int J Osteoarchaeol. 2012;22:361–365. doi: 10.1002/OA.1202.
  • 77. Introna F, di Vella G, Campobasso CP. Sex determination by discriminant analysis of patella measurements. Forensic Sci Int. 1998;95:39–45. doi: 10.1016/S0379-0738(98)00080-2.
  • 78. Mahfouz M, Badawi A, Merkl B, et al. Patella sex determination by 3D statistical shape models and nonlinear classifiers. Forensic Sci Int. 2007;173:161–170. doi: 10.1016/j.forsciint.2007.02.024.
  • 79. Akhlaghi M, Sheikhazadi A, Naghsh A, Dorvashi G. Identification of sex in Iranian population using patella dimensions. J Forensic Legal Med. 2010;17:150–155. doi: 10.1016/J.JFLM.2009.11.005.
  • 80. Peckmann TR, Meek S, Dilkie N, Rozendaal A. Determination of sex from the patella in a contemporary Spanish population. J Forensic Legal Med. 2016;44:84–91. doi: 10.1016/j.jflm.2016.09.007.
  • 81. Peckmann TR, Fisher B. Sex estimation from the patella in an African American population. J Forensic Legal Med. 2018;54:1–7. doi: 10.1016/j.jflm.2017.12.002.
  • 82. Michiue T, Hishmat A, Oritani S, et al. Virtual computed tomography morphometry of the patella for estimation of sex using postmortem Japanese adult data in forensic identification. Forensic Sci Int. 2018;285:206–e1. doi: 10.1016/j.forsciint.2017.11.029.
  • 83. Teke YH, Ünlütürk Ö, Günaydin E, et al. Determining gender by taking measurements from magnetic resonance images of the patella. J Forensic Legal Med. 2018;58:87–92. doi: 10.1016/j.jflm.2018.05.002.
  • 84. Indra L, Vach W, Desideri J, et al. Testing the validity of population-specific sex estimation equations: an evaluation based on talus and patella measurements. Sci Justice. 2021;61:555–563. doi: 10.1016/j.scijus.2021.06.011.
  • 85. Bidmos MA, Mazengenya P. Accuracies of discriminant function equations for sex estimation using long bones of upper extremities. Int J Legal Med. 2021;135:1095–102. doi: 10.1007/s00414-020-02458-y.
  • 86. Macaluso PJ. The efficacy of sternal measurements for sex estimation in South African blacks. Forensic Sci Int. 2010;202:111.e1–111.e7. doi: 10.1016/j.forsciint.2010.07.019.
  • 87. Hora M, Sládek V. Population specificity of sex estimation from vertebrae. Forensic Sci Int. 2018;291:279.e1–279.e12. doi: 10.1016/j.forsciint.2018.08.015.

Articles from International Journal of Legal Medicine are provided here courtesy of Springer
