Table 3.
Top 10 most important features for each machine learning method, in descending order of importance. The parameters learned by the trained models give an estimate of relative feature importance: for logistic regression, the coefficients of the hyperplanes separating the classes; for XGBoost, the degree of decision-tree branching on each feature. The letter in parentheses indicates the member of the trio: specimen (S), mother (M), or father (F).
| Rank | Logistic regression | XGBoost |
|---|---|---|
| 1 | Allelic depth (ref) (S) | Genotype (0/1) (S) |
| 2 | Allelic depth (alt) (S) | Phred likelihood (1/1) (S) |
| 3 | Phred likelihood (0/0) (S) | Allele count |
| 4 | Phred likelihood (1/1) (S) | Phred likelihood (0/1) (M) |
| 5 | Allelic depth (alt) (M) | Allelic depth (alt) (F) |
| 6 | Phred likelihood (0/0) (M) | Allelic depth (alt) (M) |
| 7 | Phred likelihood (0/1) (S) | Phred likelihood (0/1) (S) |
| 8 | Allelic depth (ref) (M) | Contamination |
| 9 | Read depth (S) | Allelic depth (ref) (M) |
| 10 | Phred likelihood (1/1) (M) | Phred likelihood (0/0) (F) |
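
A ranking like the one in Table 3 could be derived roughly as follows. This is a minimal sketch, not the authors' procedure: it assumes the logistic regression coefficients are available as one row per class (as in scikit-learn's `LogisticRegression.coef_`) and that tree branching is measured as the number of splits per feature (as returned by XGBoost's `Booster.get_score(importance_type="weight")`). The function names, feature names, and numbers below are hypothetical and chosen only for illustration. Coefficient magnitudes are comparable across features only if the inputs were scaled to a common range.

```python
def rank_by_coefficients(coef, feature_names, top_k=10):
    """Rank features by mean absolute coefficient across classes.

    coef: list of rows, one row of coefficients per class (assumed layout).
    """
    n_classes = len(coef)
    scores = [
        sum(abs(row[j]) for row in coef) / n_classes
        for j in range(len(feature_names))
    ]
    # Sort feature indices from largest to smallest score.
    order = sorted(range(len(feature_names)), key=lambda j: scores[j], reverse=True)
    return [feature_names[j] for j in order[:top_k]]


def rank_by_split_counts(split_counts, top_k=10):
    """Rank features by how many tree splits use them.

    split_counts: dict mapping feature name to number of splits (assumed input).
    """
    ranked = sorted(split_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Made-up toy values, not results from the paper.
toy_names = ["ad_ref_S", "ad_alt_S", "dp_S"]
toy_coef = [[0.5, -2.0, 0.1], [-1.0, 1.0, 0.3]]
toy_splits = {"gt_01_S": 40, "pl_11_S": 25, "allele_count": 31}

lr_top = rank_by_coefficients(toy_coef, toy_names)
xgb_top = rank_by_split_counts(toy_splits)
print(lr_top)
print(xgb_top)
```

Split counts are only one of several importance measures XGBoost exposes; gain-based importance can produce a different ordering, so the choice of measure should be stated alongside any such table.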