Table 4.
Mean training and validation scores for the best-performing machine learning models using the fixed 4.0 SUV and 1.5 × mean liver SUV thresholding segmentation techniques
Model | Selected features | Hyperparameters | Mean training score (95% CI) | Mean validation score (95% CI) |
---|---|---|---|---|
4.0 SUV | | | | |
Support vector machine | Age, PET GLCM Imc1, PET wavelet-LLH GLCM Imc2, PET wavelet-HLL GLSZM small area emphasis, PET log-sigma-2-0-mm-3D GLSZM small area emphasis | C: 15.78, Gamma: 0.000794, Kernel: sigmoid | 0.68 ± 0.004 | 0.66 ± 0.02 |
Logistic regression | Age, PET least axis length, PET wavelet-HLL GLCM correlation, PET wavelet-HLH GLCM Idmn, CT wavelet-HLL GLSZM large area low grey level emphasis | C: 1, Penalty: l2, Solver: lbfgs | 0.80 ± 0.002 | 0.78 ± 0.01 |
Random forest | Age | Bootstrap: true, Max depth: 1, Min samples per leaf: 11, Min samples per split: 32, Number of estimators: 213 | 0.67 ± 0.004 | 0.64 ± 0.02 |
Multi-layer perceptron | Age, PET major axis length, PET wavelet-HHL GLCM Imc1, PET lbp-3D-k first order 10th percentile | Learning rate: invscaling, Solver: sgd | 0.68 ± 0.004 | 0.68 ± 0.02 |
1.5 × mean liver SUV | | | | |
Support vector machine | PET first order 90th percentile, PET wavelet-LHH GLDM dependence non-uniformity normalised | C: 3.398, Gamma: 0.1005, Kernel: sigmoid | 0.54 ± 0.008 | 0.55 ± 0.02 |
Logistic regression | Age, PET flatness, PET major axis length, PET logarithm GLSZM size zone non-uniformity normalised, PET lbp-3D-m1 GLCM correlation, PET lbp-3D-m2 first order skewness | C: 1, Penalty: l2, Solver: sag | 0.82 ± 0.002 | 0.79 ± 0.01 |
Random forest | Age | Bootstrap: true, Max depth: 1, Min samples per leaf: 11, Min samples per split: 48, Number of estimators: 213 | 0.67 ± 0.004 | 0.64 ± 0.02 |
Multi-layer perceptron | Age, PET flatness, PET major axis length | Learning rate: invscaling, Solver: adam | 0.77 ± 0.004 | 0.75 ± 0.01 |
The K-nearest neighbours, single-layer perceptron and Gaussian process classifier models were over-fitted, with a difference of > 0.10 between their mean training and validation AUCs. SUV, standardised uptake value; AUC, area under the curve; l2, ridge regression penalty; liblinear, a library for large linear classification; GLSZM, grey level size zone matrix; GLCM, grey level co-occurrence matrix; GLDM, grey level dependence matrix; rbf, radial basis function; L, low; H, high; Imc1, informational measure of correlation 1; Imc2, informational measure of correlation 2; Idmn, inverse difference moment normalised; lbp, local binary pattern
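As an illustration of the two segmentation approaches named in the table caption, the following minimal sketch (Python/NumPy; the volume, liver mask and function names are assumptions for illustration, not part of the study's actual pipeline) shows how a fixed 4.0 SUV threshold and a 1.5 × mean liver SUV threshold could be applied to a PET SUV map.

```python
import numpy as np

def fixed_suv_mask(suv_volume: np.ndarray, threshold: float = 4.0) -> np.ndarray:
    """Binary lesion mask: voxels with SUV at or above a fixed threshold (4.0 SUV here)."""
    return suv_volume >= threshold

def liver_relative_suv_mask(suv_volume: np.ndarray, liver_mask: np.ndarray,
                            factor: float = 1.5) -> np.ndarray:
    """Binary lesion mask: voxels with SUV at or above factor x mean SUV of a liver region.

    `liver_mask` is assumed to be a boolean array of the same shape as the PET
    volume, delineating the reference liver region.
    """
    liver_mean_suv = suv_volume[liver_mask].mean()
    return suv_volume >= factor * liver_mean_suv

# Hypothetical usage with a random volume standing in for a PET SUV map.
suv = np.random.gamma(shape=2.0, scale=1.5, size=(64, 64, 32))
liver = np.zeros_like(suv, dtype=bool)
liver[20:40, 20:40, 10:20] = True

mask_fixed = fixed_suv_mask(suv, threshold=4.0)
mask_liver = liver_relative_suv_mask(suv, liver, factor=1.5)
```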
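As a rough guide to how the tuned hyperparameters in the 4.0 SUV rows map onto a software configuration, the sketch below instantiates the four models with scikit-learn; this assumes scikit-learn estimators and is an illustrative reconstruction rather than the authors' training code, with any parameter not listed in the table left at the library default.

```python
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Hyperparameters taken from the 4.0 SUV rows of Table 4; anything not listed
# there is left at its scikit-learn default and may differ from the study.
svm = SVC(C=15.78, gamma=0.000794, kernel="sigmoid", probability=True)
log_reg = LogisticRegression(C=1, penalty="l2", solver="lbfgs")
random_forest = RandomForestClassifier(
    bootstrap=True,
    max_depth=1,
    min_samples_leaf=11,
    min_samples_split=32,
    n_estimators=213,
)
mlp = MLPClassifier(learning_rate="invscaling", solver="sgd")
```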