Informed by assumptions, background knowledge and theory. |
Exploratory and data-driven; learns automatically from data. |
Typically use a small number of variables to predict the probability of an outcome. |
May be more suited to handling a large number of predictors in data with high signal-to-noise ratio. |
Mainly linear effect of variables on outcome. |
More flexible; captures non-linear associations and interactions between variables, but requires strategies to reduce overfitting. |
Provide clinically informative relationships between variables and outcome, allowing, for example, consideration of counterfactuals. |
Limited clinical interpretability; ‘black-box’ algorithms may lack face validity for clinicians, especially if they rely on a large number of unintuitive predictors. |
Results are often presented simply for the end-user, for example, by conversion to a score. |
Transparent presentation of results is difficult. |
Can undertake model updating for use in populations with different baseline risk. |
Testing calibration and updating to a new baseline risk are difficult for many models. |
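
As an illustration of what updating a model for a population with a different baseline risk can involve, the sketch below shifts a predicted risk on the log-odds scale by the difference in baseline log-odds between the original and the new population. It is a minimal example and not a method taken from the comparison above: it assumes that only the baseline risk differs and that predictor effects are unchanged. The function and parameter names (`update_baseline_risk`, `old_prevalence`, `new_prevalence`) are hypothetical. In practice, the intercept would usually be re-estimated from data collected in the new population.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_baseline_risk(predicted_risk, old_prevalence, new_prevalence):
    """Shift a predicted risk to a population with a different baseline risk.

    Assumption (illustrative only): predictor effects are unchanged, so only
    the intercept moves, by the difference in baseline log-odds.
    """
    shift = logit(new_prevalence) - logit(old_prevalence)
    return inv_logit(logit(predicted_risk) + shift)


if __name__ == "__main__":
    # A patient with 20% predicted risk in a population with 10% baseline risk,
    # re-expressed for a population with 30% baseline risk.
    print(update_baseline_risk(0.20, old_prevalence=0.10, new_prevalence=0.30))
```

A shift of this kind is straightforward for a model with an explicit intercept on the log-odds scale, such as logistic regression. That is part of why updating is easier for regression-based models than for many machine-learning algorithms.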