Figure 10. Statistical modeling in PeptideProphet.
In addition to the database search score S, PeptideProphet models other discriminant features, e.g., ΔM, NTT, NMC, and the normalized ΔpI score. If the searched protein sequence database contains decoy sequences (optional), the modeling can be performed in a semi-supervised way, in which the score distributions observed for decoy peptides help to derive the mixture components (shown as histograms) for each of the scores used in the modeling (red: correct PSMs; green: incorrect PSMs). The outcome of the modeling is the posterior probability P that the match is correct, computed for each peptide-to-spectrum match.
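
As a rough illustration of how a posterior probability P can be obtained from a two-component mixture (correct vs. incorrect), the sketch below applies Bayes' rule to per-feature likelihoods. It is not PeptideProphet's actual code: the function name `psm_posterior`, the treatment of the features as conditionally independent, and all numeric values are assumptions made for this example.

```python
from math import prod


def psm_posterior(prior_correct, lik_correct, lik_incorrect):
    """Posterior probability that a PSM is correct under a two-component mixture.

    prior_correct : mixing proportion of correct PSMs (between 0 and 1).
    lik_correct   : likelihood of each observed feature value (e.g. S, ΔM, NTT)
                    under the "correct" component.
    lik_incorrect : likelihood of the same feature values under the
                    "incorrect" component.
    Assumption: features are conditionally independent given the component,
    so the per-feature likelihoods are multiplied.
    """
    numerator = prior_correct * prod(lik_correct)
    denominator = numerator + (1.0 - prior_correct) * prod(lik_incorrect)
    # Guard against a zero denominator when both components assign zero likelihood.
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical likelihoods for three features of a single PSM.
p = psm_posterior(
    prior_correct=0.3,
    lik_correct=[0.8, 0.6, 0.7],
    lik_incorrect=[0.1, 0.3, 0.2],
)
print(f"P(correct | features) = {p:.2f}")
```

In the semi-supervised setting described in the caption, the decoy PSMs would inform the shape of the "incorrect" component, i.e. the values passed as `lik_incorrect` in this sketch.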